Search Results for author: Balaji Lakshminarayanan

Found 61 papers, 31 papers with code

Morse Neural Networks for Uncertainty Quantification

no code implementations 2 Jul 2023 Benoit Dherin, Huiyi Hu, Jie Ren, Michael W. Dusenberry, Balaji Lakshminarayanan

We introduce a new deep generative model useful for uncertainty quantification: the Morse neural network, which generalizes unnormalized Gaussian densities so that their modes can be high-dimensional submanifolds rather than isolated points.

Anomaly Detection One-class classifier

Building One-class Detector for Anything: Open-vocabulary Zero-shot OOD Detection Using Text-image Models

no code implementations 26 May 2023 Yunhao Ge, Jie Ren, Jiaping Zhao, KaiFeng Chen, Andrew Gallagher, Laurent Itti, Balaji Lakshminarayanan

Despite considerable effort, the problem remains significantly challenging in deep learning models due to their propensity to output over-confident predictions for OOD inputs.

Out of Distribution (OOD) Detection

What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel

no code implementations 22 Feb 2023 Yao Qin, Xuezhi Wang, Balaji Lakshminarayanan, Ed H. Chi, Alex Beutel

A wide breadth of research has devised data augmentation approaches that can improve both accuracy and generalization performance for neural networks.

Data Augmentation

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

no code implementations 13 Feb 2023 James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Xiuye Gu, Yin Cui, Dustin Tran, Jeremiah Zhe Liu, Balaji Lakshminarayanan

In particular, we ask "Given a large pool of prompts, can we automatically score the prompts and ensemble those that are most suitable for a particular downstream dataset, without needing access to labeled validation data?".

Prompt Engineering Zero-Shot Learning

Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play

no code implementations 11 Feb 2023 Jeremiah Zhe Liu, Krishnamurthy Dj Dvijotham, Jihyeon Lee, Quan Yuan, Martin Strobel, Balaji Lakshminarayanan, Deepak Ramachandran

Standard empirical risk minimization (ERM) training can produce deep neural network (DNN) models that are accurate on average but under-perform in under-represented population subgroups, especially when there are imbalanced group distributions in the long-tailed training data.

Active Learning Fairness

Improving the Robustness of Summarization Models by Detecting and Removing Input Noise

no code implementations 20 Dec 2022 Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu

We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes.

Abstractive Text Summarization

Out-of-Distribution Detection and Selective Generation for Conditional Language Models

no code implementations 30 Sep 2022 Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu

Furthermore, the space of potential low-quality outputs is larger, as arbitrary text can be generated, and it is important to know when to trust the generated output.

Abstractive Text Summarization Out-of-Distribution Detection +1

A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness

2 code implementations 1 May 2022 Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zack Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan

The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles.

Data Augmentation Probabilistic Deep Learning

Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift

no code implementations 25 Nov 2021 Kehang Han, Balaji Lakshminarayanan, Jeremiah Liu

The concern of overconfident mis-predictions under distributional shift demands extensive reliability research on Graph Neural Networks used in critical tasks in drug discovery.

Drug Discovery

Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation

no code implementations 15 Oct 2021 Yao Qin, Chiyuan Zhang, Ting Chen, Balaji Lakshminarayanan, Alex Beutel, Xuezhi Wang

We show that patch-based negative augmentation consistently improves robustness of ViTs across a wide set of ImageNet based robustness benchmarks.

Data Augmentation

Sparse MoEs meet Efficient Ensembles

1 code implementation 7 Oct 2021 James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models.

Few-Shot Learning

Deep Classifiers with Label Noise Modeling and Distance Awareness

no code implementations 6 Oct 2021 Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications.

Out-of-Distribution Detection

Soft Calibration Objectives for Neural Networks

no code implementations NeurIPS 2021 Archit Karandikar, Nicholas Cain, Dustin Tran, Balaji Lakshminarayanan, Jonathon Shlens, Michael C. Mozer, Becca Roelofs

When incorporated into training, these soft calibration losses achieve state-of-the-art single-model ECE across multiple datasets with less than 1% decrease in accuracy.

Decision Making
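
For context on the metric mentioned in this entry: expected calibration error (ECE) bins predictions by confidence and averages the gap between within-bin confidence and within-bin accuracy. The numpy sketch below shows the standard binned ECE only as background; it is not the paper's soft, differentiable objective, and the names are illustrative.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: size-weighted average gap between confidence and accuracy."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of examples in the bin
    return float(ece)
```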

An Instance-Dependent Simulation Framework for Learning with Label Noise

1 code implementation 23 Jul 2021 Keren Gu, Xander Masotto, Vandana Bachani, Balaji Lakshminarayanan, Jack Nikodem, Dong Yin

We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm.

Learning with noisy labels

BEDS-Bench: Behavior of EHR-models under Distributional Shift--A Benchmark

1 code implementation 17 Jul 2021 Anand Avati, Martin Seneviratne, Emily Xue, Zhen Xu, Balaji Lakshminarayanan, Andrew M. Dai

Most ML approaches focus on generalization performance on unseen data that are similar to the training data (In-Distribution, or IND).

Task-agnostic Continual Learning with Hybrid Probabilistic Models

no code implementations ICML Workshop INNF 2021 Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning.

Anomaly Detection Continual Learning +1

A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection

3 code implementations 16 Jun 2021 Jie Ren, Stanislav Fort, Jeremiah Liu, Abhijit Guha Roy, Shreyas Padhy, Balaji Lakshminarayanan

Mahalanobis distance (MD) is a simple and popular post-processing method for detecting out-of-distribution (OOD) inputs in neural networks.

Intent Detection Out-of-Distribution Detection +1
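
As background for this entry, the sketch below shows how a standard Mahalanobis-distance OOD score is typically computed from a classifier's penultimate-layer features: fit class-conditional Gaussians with a shared covariance on in-distribution features, then score a test feature by its distance to the nearest class mean. This is only an illustration with made-up function names; the paper's "simple fix" (a relative variant that subtracts the distance under a single background Gaussian fit to all training features) is mentioned in a comment rather than implemented.

```python
import numpy as np

def fit_class_gaussians(feats, labels, eps=1e-6):
    """Per-class means and a shared (regularized) covariance from in-distribution features."""
    classes = np.unique(labels)
    means = {c: feats[labels == c].mean(axis=0) for c in classes}
    centered = np.concatenate([feats[labels == c] - means[c] for c in classes])
    cov = np.cov(centered, rowvar=False) + eps * np.eye(feats.shape[1])
    return means, np.linalg.inv(cov)

def mahalanobis_ood_score(z, means, precision):
    """Distance to the closest class mean; larger values suggest OOD.
    (The paper's relative variant would subtract a class-agnostic background distance.)"""
    dists = [float((z - mu) @ precision @ (z - mu)) for mu in means.values()]
    return min(dists)
```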

Test Sample Accuracy Scales with Training Sample Density in Neural Networks

1 code implementation 15 Jun 2021 Xu Ji, Razvan Pascanu, Devon Hjelm, Balaji Lakshminarayanan, Andrea Vedaldi

Intuitively, one would expect the accuracy of a trained neural network's predictions on test samples to correlate with how densely those samples are surrounded by seen training samples in representation space.

Image Classification

What are effective labels for augmented data? Improving robustness with AutoLabel

no code implementations 1 Jan 2021 Yao Qin, Xuezhi Wang, Balaji Lakshminarayanan, Ed Chi, Alex Beutel

Despite this, most existing work simply reuses the original label from the clean data, and the choice of label accompanying the augmented data is relatively less explored.

Adversarial Robustness Data Augmentation

Combining Ensembles and Data Augmentation can Harm your Calibration

no code implementations ICLR 2021 Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness.

Data Augmentation

Why Are Bootstrapped Deep Ensembles Not Better?

no code implementations NeurIPS Workshop ICBINB 2020 Jeremy Nixon, Balaji Lakshminarayanan, Dustin Tran

Ensemble methods have consistently reached state of the art across predictive, uncertainty, and out-of-distribution robustness benchmarks.

Training independent subnetworks for robust prediction

1 code implementation ICLR 2021 Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network.

Bayesian Deep Ensembles via the Neural Tangent Kernel

3 code implementations NeurIPS 2020 Bobby He, Balaji Lakshminarayanan, Yee Whye Teh

We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK): a recent development in understanding the training dynamics of wide neural networks (NNs).

Gaussian Processes

Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift

no code implementations 19 Jun 2020 Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji Lakshminarayanan, Jasper Snoek

Using this one-line code change, we achieve state-of-the-art results on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or modification of the training pipeline.

Data Augmentation
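
The "one-line code change" the abstract refers to is prediction-time batch normalization: normalize with statistics computed from the test batch instead of the running averages stored during training. Below is a minimal PyTorch sketch of that idea, assuming an ordinary model containing BatchNorm layers; it is an illustration, not the paper's released code.

```python
import torch

def predict_with_test_batch_stats(model, x):
    """Run inference while BatchNorm layers recompute mean/variance from the test batch."""
    model.eval()
    for m in model.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.train()  # the forward pass now uses batch statistics
            # note: this also updates the running buffers; work on a copy if that matters
    with torch.no_grad():
        return model(x)
```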

Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness

3 code implementations NeurIPS 2020 Jeremiah Zhe Liu, Zi Lin, Shreyas Padhy, Dustin Tran, Tania Bedrax-Weiss, Balaji Lakshminarayanan

Bayesian neural networks (BNN) and deep ensembles are principled approaches to estimate the predictive uncertainty of a deep learning model.

Density of States Estimation for Out-of-Distribution Detection

no code implementations 16 Jun 2020 Warren R. Morningstar, Cusuh Ham, Andrew G. Gallagher, Balaji Lakshminarayanan, Alexander A. Alemi, Joshua V. Dillon

Drawing on the statistical physics notion of "density of states," the DoSE decision rule avoids direct comparison of model probabilities, and instead utilizes the "probability of the model probability," or indeed the frequency of any reasonable statistic.

Out-of-Distribution Detection Out of Distribution (OOD) Detection +1

Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

1 code implementation ICML 2020 Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning.

Deep Ensembles: A Loss Landscape Perspective

1 code implementation 5 Dec 2019 Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan

One possible explanation for this gap between theory and practice is that popular scalable variational Bayesian methods tend to focus on a single mode, whereas deep ensembles tend to explore diverse modes in function space.

Normalizing Flows for Probabilistic Modeling and Inference

6 code implementations 5 Dec 2019 George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan

In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference.
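
The central identity such a review is organized around is the change-of-variables formula: if x = f(z) with f invertible and p_z a base density, then log p_x(x) = log p_z(f^{-1}(x)) + log |det d f^{-1}(x)/dx|. The toy sketch below applies this to a single exponential transform purely as an illustration; it is not any particular flow from the review.

```python
import numpy as np

def log_density_exp_flow(x):
    """log-density of x = exp(z), z ~ N(0, 1), via the change of variables
    log p_x(x) = log p_z(log x) + log |d(log x)/dx| = log p_z(log x) - log x."""
    z = np.log(x)
    log_pz = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)  # standard normal log-density
    return log_pz - np.log(x)
```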

Detecting Out-of-Distribution Inputs to Deep Generative Models Using Typicality

2 code implementations 7 Jun 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Balaji Lakshminarayanan

To determine whether or not inputs reside in the typical set, we propose a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods.
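
One way such a typicality-style test could look in practice is sketched below, under the assumption that per-example log-likelihoods from the trained generative model are available; the bootstrap procedure and names are illustrative, not taken from the paper.

```python
import numpy as np

def typicality_bounds(train_loglik, batch_size, alpha=0.01, n_boot=10_000, seed=0):
    """Bootstrap the distribution of batch-mean log-likelihoods on in-distribution
    data and return two-sided acceptance bounds for the typical set."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(train_loglik), size=(n_boot, batch_size))
    boot_means = np.asarray(train_loglik)[idx].mean(axis=1)
    return np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])

def batch_is_ood(batch_loglik, bounds):
    """Flag a test batch whose mean log-likelihood falls outside the typical range."""
    lo, hi = bounds
    mean = float(np.mean(batch_loglik))
    return mean < lo or mean > hi
```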

Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift

2 code implementations NeurIPS 2019 Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D. Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, Jasper Snoek

Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty.

Probabilistic Deep Learning

Hybrid Models with Deep and Invertible Features

1 code implementation 7 Feb 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan

We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e., a normalizing flow).

Probabilistic Deep Learning

Adapting Auxiliary Losses Using Gradient Similarity

1 code implementation 5 Dec 2018 Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan

One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations.

Atari Games reinforcement-learning +1
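
The idea named in this entry's title can be pictured as a gradient-agreement gate: compare the gradient of the main-task loss with that of the auxiliary loss and keep the auxiliary update only when the two roughly agree. The PyTorch sketch below is an assumption about how such a cosine-similarity test might be wired up, not the authors' implementation; the helper name is hypothetical.

```python
import torch

def combined_grads(main_loss, aux_loss, params):
    """Gate the auxiliary gradient by its cosine similarity with the main-task gradient."""
    g_main = torch.autograd.grad(main_loss, params, retain_graph=True)
    g_aux = torch.autograd.grad(aux_loss, params, retain_graph=True)
    flat = lambda grads: torch.cat([g.reshape(-1) for g in grads])
    cos = torch.nn.functional.cosine_similarity(flat(g_main), flat(g_aux), dim=0)
    keep = (cos >= 0).to(main_loss.dtype)  # drop the auxiliary signal when it conflicts
    return [gm + keep * ga for gm, ga in zip(g_main, g_aux)]
```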

Do Deep Generative Models Know What They Don't Know?

4 code implementations ICLR 2019 Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan

A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data.

Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems

no code implementations 24 Jul 2018 Timothy A. Mann, Sven Gowal, András György, Ray Jiang, Huiyi Hu, Balaji Lakshminarayanan, Prav Srinivasan

Predicting delayed outcomes is an important problem in recommender systems (e.g., if customers will finish reading an ebook).

Recommendation Systems

Distribution Matching in Variational Inference

no code implementations 19 Feb 2018 Mihaela Rosca, Balaji Lakshminarayanan, Shakir Mohamed

With the increasingly widespread deployment of generative models, there is a mounting need for a deeper understanding of their behaviors and limitations.

Variational Inference

Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step

1 code implementation ICLR 2018 William Fedus, Mihaela Rosca, Balaji Lakshminarayanan, Andrew M. Dai, Shakir Mohamed, Ian Goodfellow

Unlike other generative models, the data distribution is learned via a game between a generator (the generative model) and a discriminator (a teacher providing training signal) that each minimize their own cost.

Variational Approaches for Auto-Encoding Generative Adversarial Networks

6 code implementations 15 Jun 2017 Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed

In this paper, we develop a principle upon which auto-encoders can be combined with generative adversarial networks by exploiting the hierarchical structure of the generative model.

Variational Inference

The Cramer Distance as a Solution to Biased Wasserstein Gradients

2 code implementations ICLR 2018 Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos

We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences.

BIG-bench Machine Learning
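
For context, the Cramér distance between two one-dimensional distributions is (up to the convention used) the squared L2 distance between their cumulative distribution functions, i.e. the integral of (F_P(x) - F_Q(x))^2 over x. The small numpy sketch below gives an empirical estimate from samples, purely as an illustration.

```python
import numpy as np

def empirical_cramer_distance(x, y):
    """Integrate the squared difference of the two empirical CDFs over the pooled sample."""
    grid = np.sort(np.concatenate([x, y]))
    F_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    F_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    widths = np.diff(grid)  # step widths between consecutive grid points
    return float(np.sum((F_x[:-1] - F_y[:-1]) ** 2 * widths))
```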

Comparison of Maximum Likelihood and GAN-based training of Real NVPs

no code implementations 15 May 2017 Ivo Danihelka, Balaji Lakshminarayanan, Benigno Uria, Daan Wierstra, Peter Dayan

We train a generator by maximum likelihood and we also train the same generator architecture by Wasserstein GAN.

One-Shot Learning

Learning Deep Nearest Neighbor Representations Using Differentiable Boundary Trees

1 code implementation 28 Feb 2017 Daniel Zoran, Balaji Lakshminarayanan, Charles Blundell

We introduce a new method called differentiable boundary tree which allows for learning deep kNN representations.

Retrieval

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

25 code implementations NeurIPS 2017 Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell

Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks.
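
The recipe behind deep ensembles is to train several copies of the same network with different random initializations (and data orderings) and to average their predictive distributions at test time. Below is a minimal sketch of the prediction-time averaging step, assuming each member's class probabilities have already been computed; the names are illustrative and this is not the paper's code.

```python
import numpy as np

def ensemble_predict(member_probs):
    """member_probs: list of (num_examples, num_classes) arrays, one per trained network."""
    probs = np.mean(np.stack(member_probs), axis=0)           # averaged predictive distribution
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)  # simple per-example uncertainty
    return probs, entropy
```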

Learning in Implicit Generative Models

no code implementations 11 Oct 2016 Shakir Mohamed, Balaji Lakshminarayanan

We frame GANs within the wider landscape of algorithms for learning in implicit generative models (models that only specify a stochastic procedure with which to generate data) and relate these ideas to modelling problems in related fields, such as econometrics and approximate Bayesian computation.

Density Ratio Estimation Econometrics +1

The Mondrian Kernel

no code implementations 16 Jun 2016 Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh

We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel.

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server

no code implementations 31 Dec 2015 Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh

The posterior server allows scalable and robust Bayesian learning in cases where a data set is stored in a distributed manner across a cluster, with each compute node containing a disjoint subset of data.

Variational Inference

Approximate Inference with the Variational Holder Bound

no code implementations 19 Jun 2015 Guillaume Bouchard, Balaji Lakshminarayanan

We introduce the Variational Holder (VH) bound as an alternative to Variational Bayes (VB) for approximate Bayesian inference.

Bayesian Inference Numerical Integration

Kernel-Based Just-In-Time Learning for Passing Expectation Propagation Messages

1 code implementation 9 Mar 2015 Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó

We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output.

regression

Particle Gibbs for Bayesian Additive Regression Trees

no code implementations 16 Feb 2015 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Additive regression trees are flexible non-parametric models and popular off-the-shelf tools for real-world non-linear regression.

regression

Distributed Bayesian Posterior Sampling via Moment Sharing

no code implementations NeurIPS 2014 Minjie Xu, Balaji Lakshminarayanan, Yee Whye Teh, Jun Zhu, Bo Zhang

We propose a distributed Markov chain Monte Carlo (MCMC) inference algorithm for large scale Bayesian posterior simulation.

regression

Mondrian Forests: Efficient Online Random Forests

2 code implementations NeurIPS 2014 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics.

Inferring ground truth from multi-annotator ordinal data: a probabilistic approach

no code implementations 30 Apr 2013 Balaji Lakshminarayanan, Yee Whye Teh

A popular approach for large scale data annotation tasks is crowdsourcing, wherein each data point is labeled by multiple noisy annotators.

Bayesian Inference

Top-down particle filtering for Bayesian decision trees

no code implementations 3 Mar 2013 Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh

Unlike classic decision tree learning algorithms like ID3, C4.5 and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC).
