no code implementations • 2 Jul 2023 • Benoit Dherin, Huiyi Hu, Jie Ren, Michael W. Dusenberry, Balaji Lakshminarayanan
We introduce a new deep generative model useful for uncertainty quantification: the Morse neural network, which generalizes unnormalized Gaussian densities so that their modes can be high-dimensional submanifolds rather than just discrete points.
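To make the contrast concrete, here is a minimal numpy sketch (an illustration of the idea, not the paper's architecture): an unnormalized Gaussian peaks at a single point, whereas a density of the form exp(-f(x)^2) attains its maximum of 1 on the entire zero set of f, which can be a submanifold such as a circle.

```python
import numpy as np

# Unnormalized Gaussian: its mode is the single point mu.
def gaussian_density(x, mu):
    return np.exp(-0.5 * np.sum((x - mu) ** 2))

# Morse-style density (illustrative): exp(-f(x)^2) attains its maximum of 1
# on the whole zero set of f. Here f(x) = x0^2 + x1^2 - 1, so the mode set
# is the unit circle, a 1-dimensional submanifold.
def morse_density(x):
    f = x[0] ** 2 + x[1] ** 2 - 1.0
    return np.exp(-f ** 2)

print(gaussian_density(np.zeros(2), np.zeros(2)))   # 1.0 at the point mode
print(morse_density(np.array([1.0, 0.0])))          # 1.0 anywhere on the circle
```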
no code implementations • 26 May 2023 • Yunhao Ge, Jie Ren, Jiaping Zhao, KaiFeng Chen, Andrew Gallagher, Laurent Itti, Balaji Lakshminarayanan
Despite considerable effort, the problem remains significantly challenging in deep learning models due to their propensity to output over-confident predictions for OOD inputs.
no code implementations • 22 Feb 2023 • Yao Qin, Xuezhi Wang, Balaji Lakshminarayanan, Ed H. Chi, Alex Beutel
A broad body of research has devised data augmentation approaches that can improve both accuracy and generalization performance for neural networks.
no code implementations • 13 Feb 2023 • James Urquhart Allingham, Jie Ren, Michael W Dusenberry, Xiuye Gu, Yin Cui, Dustin Tran, Jeremiah Zhe Liu, Balaji Lakshminarayanan
In particular, we ask "Given a large pool of prompts, can we automatically score the prompts and ensemble those that are most suitable for a particular downstream dataset, without needing access to labeled validation data?".
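A minimal sketch of how such label-free prompt ensembling could look, assuming zero-shot logits from a model such as CLIP; the max-softmax scoring rule here is a hypothetical stand-in for illustration, not the paper's actual criterion.

```python
import numpy as np

def ensemble_prompts(logits_per_prompt, top_k=5):
    """logits_per_prompt: array [n_prompts, n_examples, n_classes] of zero-shot
    logits, one slice per candidate prompt. Scores each prompt by its mean
    max-softmax confidence on *unlabeled* data (an assumed stand-in score),
    then averages the predictions of the top-k prompts."""
    z = logits_per_prompt - logits_per_prompt.max(axis=-1, keepdims=True)
    probs = np.exp(z)
    probs /= probs.sum(axis=-1, keepdims=True)
    scores = probs.max(axis=-1).mean(axis=-1)   # one score per prompt
    best = np.argsort(scores)[-top_k:]          # highest-scoring prompts
    return probs[best].mean(axis=0)             # ensembled predictions [N, C]
```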
no code implementations • 11 Feb 2023 • Jeremiah Zhe Liu, Krishnamurthy Dj Dvijotham, Jihyeon Lee, Quan Yuan, Martin Strobel, Balaji Lakshminarayanan, Deepak Ramachandran
Standard empirical risk minimization (ERM) training can produce deep neural network (DNN) models that are accurate on average but under-perform in under-represented population subgroups, especially when there are imbalanced group distributions in the long-tailed training data.
no code implementations • 20 Dec 2022 • Kundan Krishna, Yao Zhao, Jie Ren, Balaji Lakshminarayanan, Jiaming Luo, Mohammad Saleh, Peter J. Liu
We present a large empirical study quantifying the sometimes severe loss in performance (up to 12 ROUGE-1 points) from different types of input noise for a range of datasets and model sizes.
1 code implementation • CVPR 2023 • Yunhao Ge, Jie Ren, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, Jiaping Zhao
We also show that our method improves across ImageNet shifted datasets, four other datasets, and other model architectures such as LiT.
no code implementations • 30 Sep 2022 • Jie Ren, Jiaming Luo, Yao Zhao, Kundan Krishna, Mohammad Saleh, Balaji Lakshminarayanan, Peter J. Liu
Furthermore, the space of potential low-quality outputs is larger, since arbitrary text can be generated, so it is important to know when to trust the generated output.
Abstractive Text Summarization
Out-of-Distribution Detection
1 code implementation • 15 Jul 2022 • Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures.
2 code implementations • 1 May 2022 • Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zack Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan
The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles.
no code implementations • 25 Nov 2021 • Kehang Han, Balaji Lakshminarayanan, Jeremiah Liu
Concern about overconfident mis-predictions under distributional shift demands extensive reliability research on Graph Neural Networks used in critical drug-discovery tasks.
no code implementations • 15 Oct 2021 • Yao Qin, Chiyuan Zhang, Ting Chen, Balaji Lakshminarayanan, Alex Beutel, Xuezhi Wang
We show that patch-based negative augmentation consistently improves robustness of ViTs across a wide set of ImageNet based robustness benchmarks.
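As an illustration, one patch-based transformation of the kind the abstract alludes to is shuffling an image's patches, which preserves patch-level content but destroys global structure; a minimal numpy sketch (the exact transformation set and training loss are the paper's, not shown here):

```python
import numpy as np

def patch_shuffle(image, patch_size=16, rng=np.random.default_rng(0)):
    """Split an (H, W, C) image into patches and shuffle them: patch-level
    statistics survive but global structure is destroyed. Such images can
    serve as negatives that a robust ViT should not classify confidently."""
    h, w, c = image.shape
    ph, pw = h // patch_size, w // patch_size
    patches = image[:ph * patch_size, :pw * patch_size].reshape(
        ph, patch_size, pw, patch_size, c).transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(ph * pw, patch_size, patch_size, c)
    rng.shuffle(patches, axis=0)
    patches = patches.reshape(ph, pw, patch_size, patch_size, c)
    return patches.transpose(0, 2, 1, 3, 4).reshape(
        ph * patch_size, pw * patch_size, c)
```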
1 code implementation • 7 Oct 2021 • James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton
Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models.
no code implementations • 6 Oct 2021 • Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou
Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications.
no code implementations • NeurIPS 2021 • Archit Karandikar, Nicholas Cain, Dustin Tran, Balaji Lakshminarayanan, Jonathon Shlens, Michael C. Mozer, Becca Roelofs
When incorporated into training, these soft calibration losses achieve state-of-the-art single-model ECE across multiple datasets with less than 1% decrease in accuracy.
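For reference, the hard-binned ECE metric that these soft losses smooth out can be computed as follows (a standard implementation, not code from the paper):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Standard binned ECE: split the confidence axis into equal-width bins
    and sum the |confidence - accuracy| gap per bin, weighted by how many
    predictions fall in it. The soft calibration losses replace this hard
    binning with differentiable soft assignments so it can be trained on."""
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece
```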
1 code implementation • 23 Jul 2021 • Keren Gu, Xander Masotto, Vandana Bachani, Balaji Lakshminarayanan, Jack Nikodem, Dong Yin
We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm.
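A minimal sketch of the pseudo-labeling idea as we read it (the weak auxiliary model and noise rate below are illustrative assumptions, not the framework's actual API): because the auxiliary model's mistakes concentrate on genuinely hard instances, the injected noise is instance-dependent rather than uniform.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pseudo_label_noise(X, y, noise_rate=0.2, seed=0):
    """Replace a fraction of labels with the predictions of a deliberately
    weak auxiliary model, yielding instance-dependent label noise."""
    rng = np.random.default_rng(seed)
    weak = LogisticRegression(max_iter=50).fit(X, y)  # under-trained on purpose
    pseudo = weak.predict(X)
    flip = rng.random(len(y)) < noise_rate
    return np.where(flip, pseudo, y)
```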
1 code implementation • 17 Jul 2021 • Anand Avati, Martin Seneviratne, Emily Xue, Zhen Xu, Balaji Lakshminarayanan, Andrew M. Dai
Most ML approaches focus on generalization performance on unseen data that are similar to the training data (In-Distribution, or IND).
no code implementations • ICML Workshop INNF 2021 • Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu
Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning.
3 code implementations • 16 Jun 2021 • Jie Ren, Stanislav Fort, Jeremiah Liu, Abhijit Guha Roy, Shreyas Padhy, Balaji Lakshminarayanan
Mahalanobis distance (MD) is a simple and popular post-processing method for detecting out-of-distribution (OOD) inputs in neural networks.
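The MD baseline the paper starts from can be sketched in a few lines: fit per-class means and a shared covariance on in-distribution penultimate-layer features, then score a test input by its minimum squared distance to any class mean (a standard implementation; the paper's contribution, a relative Mahalanobis distance, builds on top of this baseline):

```python
import numpy as np

def fit_mahalanobis(features, labels):
    """Fit per-class means and a single shared covariance on in-distribution
    features, returning the means and the precision (inverse covariance)."""
    classes = np.unique(labels)
    means = np.stack([features[labels == k].mean(axis=0) for k in classes])
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)

def md_score(x, means, prec):
    """OOD score: minimum squared Mahalanobis distance to any class mean.
    Larger scores suggest the input lies further from the training data."""
    diffs = means - x
    return np.min(np.einsum('kd,de,ke->k', diffs, prec, diffs))
```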
1 code implementation • 15 Jun 2021 • Xu Ji, Razvan Pascanu, Devon Hjelm, Balaji Lakshminarayanan, Andrea Vedaldi
Intuitively, one would expect the accuracy of a trained neural network's predictions on test samples to correlate with how densely those samples are surrounded by training samples in representation space.
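A minimal kNN-style proxy for that intuition (an illustrative stand-in, not the paper's estimator):

```python
import numpy as np

def knn_density_score(train_feats, test_feat, k=10):
    """Use the mean distance to the k nearest training representations as an
    (inverse) density proxy; the question studied is whether such local
    density in representation space predicts test accuracy."""
    d = np.linalg.norm(train_feats - test_feat, axis=1)
    return -np.sort(d)[:k].mean()   # higher = denser neighborhood
```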
3 code implementations • 7 Jun 2021 • Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, Dustin Tran
In this paper we introduce Uncertainty Baselines: high-quality implementations of standard and state-of-the-art deep learning methods on a variety of tasks.
1 code implementation • NeurIPS 2021 • Stanislav Fort, Jie Ren, Balaji Lakshminarayanan
Near out-of-distribution (OOD) detection is a major challenge for deep neural networks.
Ranked #2 on Out-of-Distribution Detection on CIFAR-10 vs CIFAR-100 (using extra training data)
Out-of-Distribution Detection
no code implementations • 8 Apr 2021 • Abhijit Guha Roy, Jie Ren, Shekoofeh Azizi, Aaron Loh, Vivek Natarajan, Basil Mustafa, Nick Pawlowski, Jan Freyberg, YuAn Liu, Zach Beaver, Nam Vo, Peggy Bui, Samantha Winter, Patricia MacWilliams, Greg S. Corrado, Umesh Telang, Yun Liu, Taylan Cemgil, Alan Karthikesalingam, Balaji Lakshminarayanan, Jim Winkens
We develop and rigorously evaluate a deep learning based system that can accurately classify skin conditions while detecting rare conditions for which there is not enough data available for training a confident classifier.
no code implementations • 1 Jan 2021 • Yao Qin, Xuezhi Wang, Balaji Lakshminarayanan, Ed Chi, Alex Beutel
Despite this, most existing work simply reuses the original label from the clean data, and the choice of label to accompany the augmented data remains relatively unexplored.
no code implementations • ICLR 2021 • Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran
Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness.
no code implementations • NeurIPS Workshop ICBINB 2020 • Jeremy Nixon, Balaji Lakshminarayanan, Dustin Tran
Ensemble methods have consistently achieved state-of-the-art performance across predictive accuracy, uncertainty, and out-of-distribution robustness benchmarks.
1 code implementation • ICLR 2021 • Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran
Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network.
3 code implementations • NeurIPS 2020 • Bobby He, Balaji Lakshminarayanan, Yee Whye Teh
We explore the link between deep ensembles and Gaussian processes (GPs) through the lens of the Neural Tangent Kernel (NTK): a recent development in understanding the training dynamics of wide neural networks (NNs).
no code implementations • 10 Jul 2020 • Shreyas Padhy, Zachary Nado, Jie Ren, Jeremiah Liu, Jasper Snoek, Balaji Lakshminarayanan
Accurate estimation of predictive uncertainty in modern neural networks is critical to achieve well calibrated predictions and detect out-of-distribution (OOD) inputs.
Out-of-Distribution Detection
no code implementations • 19 Jun 2020 • Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji Lakshminarayanan, Jasper Snoek
Using this one-line code change, we achieve state-of-the-art results on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or modification of the training pipeline.
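In PyTorch, the spirit of that change can be sketched as follows (a minimal sketch of prediction-time batch normalization, i.e., normalizing with test-batch statistics; note that train mode would also enable dropout if present, so a fuller version would toggle only the BatchNorm layers):

```python
import torch

@torch.no_grad()
def predict_with_test_bn(model, batch):
    """Run inference with BatchNorm layers using statistics of the incoming
    (possibly covariate-shifted) test batch instead of the stored
    training-time running averages."""
    model.train()   # BN now normalizes with the test batch's own statistics
    logits = model(batch)
    model.eval()    # restore standard inference behavior
    return logits
```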
3 code implementations • NeurIPS 2020 • Jeremiah Zhe Liu, Zi Lin, Shreyas Padhy, Dustin Tran, Tania Bedrax-Weiss, Balaji Lakshminarayanan
Bayesian neural networks (BNN) and deep ensembles are principled approaches to estimate the predictive uncertainty of a deep learning model.
no code implementations • 16 Jun 2020 • Warren R. Morningstar, Cusuh Ham, Andrew G. Gallagher, Balaji Lakshminarayanan, Alexander A. Alemi, Joshua V. Dillon
Drawing on the statistical physics notion of "density of states," the DoSE decision rule avoids direct comparison of model probabilities, and instead utilizes the "probability of the model probability," or indeed the frequency of any reasonable statistic.
Out-of-Distribution Detection
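A minimal sketch of the density-of-states idea, using a simple KDE as a stand-in for the paper's estimators:

```python
import numpy as np
from scipy.stats import gaussian_kde

def dose_score(train_stats, test_stats):
    """Instead of thresholding a model statistic (e.g., log-likelihood)
    directly, score how *typical* its value is via the statistic's estimated
    density over training data: low density => atypical => likely OOD."""
    kde = gaussian_kde(train_stats)
    return kde(test_stats)
```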
1 code implementation • ICML 2020 • Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran
Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning.
1 code implementation • 5 Dec 2019 • Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan
One possible explanation for this gap between theory and practice is that popular scalable variational Bayesian methods tend to focus on a single mode, whereas deep ensembles tend to explore diverse modes in function space.
6 code implementations • 5 Dec 2019 • George Papamakarios, Eric Nalisnick, Danilo Jimenez Rezende, Shakir Mohamed, Balaji Lakshminarayanan
In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference.
15 code implementations • ICLR 2020 • Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan
We propose AugMix, a data processing technique that is simple to implement, adds limited computational overhead, and helps models withstand unforeseen corruptions.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
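A compact sketch of the AugMix operation (the Dirichlet/Beta mixing structure follows the paper; the concrete augmentation ops, normalization, and the accompanying Jensen-Shannon consistency loss are left out):

```python
import numpy as np

def augmix(image, augment_ops, k=3, depth=2, alpha=1.0,
           rng=np.random.default_rng(0)):
    """Mix the outputs of k short, randomly sampled augmentation chains with
    Dirichlet weights, then interpolate with the original image using a
    Beta-sampled weight. `augment_ops` is any list of image -> image ops
    (e.g., rotate, posterize)."""
    ws = rng.dirichlet([alpha] * k)
    m = rng.beta(alpha, alpha)
    mix = np.zeros_like(image, dtype=np.float32)
    for i in range(k):
        aug = image.astype(np.float32)
        for op in rng.choice(augment_ops, size=depth):
            aug = op(aug)
        mix += ws[i] * aug
    return m * image.astype(np.float32) + (1 - m) * mix
```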
2 code implementations • 7 Jun 2019 • Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Balaji Lakshminarayanan
To determine whether or not inputs reside in the typical set, we propose a statistically principled, easy-to-implement test using the empirical distribution of model likelihoods.
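A minimal sketch of such a test (the bootstrap construction here is ours, for illustration; the paper's exact procedure may differ): flag a test batch as atypical if its average log-likelihood falls outside the central interval of bootstrapped training-batch averages.

```python
import numpy as np

def typicality_test(train_loglik, test_loglik, alpha=0.01, n_boot=10000):
    """Compare a test batch's mean log-likelihood against the 1 - alpha
    interval of bootstrapped training-batch means of the same size."""
    m = len(np.atleast_1d(test_loglik))
    rng = np.random.default_rng(0)
    boot = [rng.choice(train_loglik, size=m).mean() for _ in range(n_boot)]
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    stat = np.mean(test_loglik)
    return stat < lo or stat > hi   # True => atypical / flagged as OOD
```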
5 code implementations • NeurIPS 2019 • Jie Ren, Peter J. Liu, Emily Fertig, Jasper Snoek, Ryan Poplin, Mark A. DePristo, Joshua V. Dillon, Balaji Lakshminarayanan
We propose a likelihood ratio method for deep generative models which effectively corrects for these confounding background statistics.
Out-of-Distribution Detection
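The core scoring rule is a one-liner once both models are trained: subtract the background model's log-likelihood (e.g., from a model trained on perturbed inputs) from the main model's.

```python
def likelihood_ratio_score(log_p_model, log_p_background):
    """Likelihood-ratio OOD score: subtracting the background model's
    log-likelihood removes the confounding background statistics that
    dominate raw likelihoods. Higher => more in-distribution."""
    return log_p_model - log_p_background
```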
2 code implementations • NeurIPS 2019 • Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D. Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, Jasper Snoek
Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty.
1 code implementation • 7 Feb 2019 • Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan
We propose a neural hybrid model consisting of a linear model defined on a set of features computed by a deep, invertible transformation (i.e., a normalizing flow).
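A minimal sketch of the resulting joint density, assuming a flow object exposing forward(x) -> (z, log_det) (a hypothetical interface): the exact log p(x) comes from the change of variables under a standard normal base, and p(y|x) is a linear softmax model on the same features.

```python
import numpy as np

def hybrid_log_joint(x, flow, linear):
    """Return the vector of log p(x, y) over classes y: exact flow density
    plus a linear classifier's log-softmax on the flow features."""
    z, log_det = flow.forward(x)                      # invertible features
    log_px = -0.5 * (z @ z) - 0.5 * z.size * np.log(2 * np.pi) + log_det
    logits = linear(z)                                # one logit per class
    log_py_x = logits - np.logaddexp.reduce(logits)   # log-softmax
    return log_px + log_py_x
```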
1 code implementation • 5 Dec 2018 • Yunshu Du, Wojciech M. Czarnecki, Siddhant M. Jayakumar, Mehrdad Farajtabar, Razvan Pascanu, Balaji Lakshminarayanan
One approach to deal with the statistical inefficiency of neural networks is to rely on auxiliary losses that help to build useful representations.
4 code implementations • ICLR 2019 • Eric Nalisnick, Akihiro Matsukawa, Yee Whye Teh, Dilan Gorur, Balaji Lakshminarayanan
A neural network deployed in the wild may be asked to make predictions for inputs that were drawn from a different distribution than that of the training data.
no code implementations • 24 Jul 2018 • Timothy A. Mann, Sven Gowal, András György, Ray Jiang, Huiyi Hu, Balaji Lakshminarayanan, Prav Srinivasan
Predicting delayed outcomes is an important problem in recommender systems (e.g., whether customers will finish reading an ebook).
no code implementations • 19 Feb 2018 • Mihaela Rosca, Balaji Lakshminarayanan, Shakir Mohamed
With the increasingly widespread deployment of generative models, there is a mounting need for a deeper understanding of their behaviors and limitations.
1 code implementation • ICLR 2018 • William Fedus, Mihaela Rosca, Balaji Lakshminarayanan, Andrew M. Dai, Shakir Mohamed, Ian Goodfellow
Unlike other generative models, the data distribution is learned via a game between a generator (the generative model) and a discriminator (a teacher providing training signal) that each minimize their own cost.
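The game referred to here is the standard GAN objective; a minimal PyTorch sketch of the two (non-saturating) losses:

```python
import torch
import torch.nn.functional as F

def gan_losses(d_real, d_fake):
    """The GAN game as two separate objectives. `d_real`/`d_fake` are the
    discriminator's logits on real and generated batches: the discriminator
    is trained to tell them apart, the generator to make its samples be
    classified as real."""
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    g_loss = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return d_loss, g_loss
```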
6 code implementations • 15 Jun 2017 • Mihaela Rosca, Balaji Lakshminarayanan, David Warde-Farley, Shakir Mohamed
In this paper, we develop a principle upon which auto-encoders can be combined with generative adversarial networks by exploiting the hierarchical structure of the generative model.
2 code implementations • ICLR 2018 • Marc G. Bellemare, Ivo Danihelka, Will Dabney, Shakir Mohamed, Balaji Lakshminarayanan, Stephan Hoyer, Rémi Munos
We show that the Cramér distance possesses all three desired properties, combining the best of the Wasserstein and Kullback-Leibler divergences.
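For 1-D empirical distributions, the (squared) Cramér distance is the integral of the squared difference between the two CDFs, which can be computed exactly on the merged sample grid:

```python
import numpy as np

def cramer_distance_sq(xs, ys):
    """Squared Cramér distance between two 1-D empirical distributions:
    integrate (F_x - F_y)^2 over the piecewise-constant merged grid."""
    grid = np.sort(np.concatenate([xs, ys]))
    fx = np.searchsorted(np.sort(xs), grid, side='right') / len(xs)
    fy = np.searchsorted(np.sort(ys), grid, side='right') / len(ys)
    widths = np.diff(grid, append=grid[-1])   # last interval has width 0
    return np.sum((fx - fy) ** 2 * widths)
```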
no code implementations • 15 May 2017 • Ivo Danihelka, Balaji Lakshminarayanan, Benigno Uria, Daan Wierstra, Peter Dayan
We train a generator by maximum likelihood and we also train the same generator architecture by Wasserstein GAN.
1 code implementation • 28 Feb 2017 • Daniel Zoran, Balaji Lakshminarayanan, Charles Blundell
We introduce a new method called differentiable boundary tree which allows for learning deep kNN representations.
25 code implementations • NeurIPS 2017 • Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell
Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks.
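The recipe the paper proposes is deliberately simple: train M copies of the network from independent random initializations and average their predictive distributions (a sketch below; the paper additionally trains with proper scoring rules and, optionally, adversarial training):

```python
import numpy as np

def deep_ensemble_predict(models, x):
    """`models` is a list of M independently trained networks, each a
    callable mapping inputs to class probabilities; the ensemble predictive
    distribution is their average."""
    probs = np.stack([m(x) for m in models])   # [M, n_classes]
    return probs.mean(axis=0)
```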
no code implementations • 11 Oct 2016 • Shakir Mohamed, Balaji Lakshminarayanan
We frame GANs within the wider landscape of algorithms for learning in implicit generative models (models that only specify a stochastic procedure with which to generate data) and relate these ideas to modelling problems in related fields, such as econometrics and approximate Bayesian computation.
no code implementations • 16 Jun 2016 • Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh
We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel.
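The Laplace kernel being approximated is, assuming its standard definition:

```latex
k(x, y) = \exp\left(-\lambda \lVert x - y \rVert_1\right)
```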
no code implementations • 31 Dec 2015 • Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh
The posterior server allows scalable and robust Bayesian learning in cases where a data set is stored in a distributed manner across a cluster, with each compute node containing a disjoint subset of data.
no code implementations • 19 Jun 2015 • Guillaume Bouchard, Balaji Lakshminarayanan
We introduce the Variational Holder (VH) bound as an alternative to Variational Bayes (VB) for approximate Bayesian inference.
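The bound rests on Hölder's inequality, which for conjugate exponents gives an upper bound on the integral of a product of nonnegative functions (the opposite direction from VB's lower bound):

```latex
\int f(\theta)\, g(\theta)\, d\theta \;\le\;
\left( \int f(\theta)^{p}\, d\theta \right)^{1/p}
\left( \int g(\theta)^{q}\, d\theta \right)^{1/q},
\qquad \frac{1}{p} + \frac{1}{q} = 1, \quad p, q \ge 1 .
```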
1 code implementation • 11 Jun 2015 • Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Many real-world regression problems demand a measure of the uncertainty associated with each prediction.
1 code implementation • 9 Mar 2015 • Wittawat Jitkrittum, Arthur Gretton, Nicolas Heess, S. M. Ali Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, Zoltán Szabó
We propose an efficient nonparametric strategy for learning a message operator in expectation propagation (EP), which takes as input the set of incoming messages to a factor node, and produces an outgoing message as output.
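A minimal sketch of the idea (kernel ridge regression stands in for the paper's operator, and the message featurization is an assumption): regress from vectorized incoming-message parameters at a factor to the outgoing message's parameters, replacing expensive numerical integration at prediction time.

```python
from sklearn.kernel_ridge import KernelRidge

def train_message_operator(incoming_params, outgoing_params):
    """Fit a nonparametric map from incoming-message parameter vectors
    (rows of `incoming_params`) to outgoing-message parameter vectors."""
    op = KernelRidge(kernel='rbf', alpha=1e-3)
    return op.fit(incoming_params, outgoing_params)
```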
no code implementations • 16 Feb 2015 • Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Additive regression trees are flexible non-parametric models and popular off-the-shelf tools for real-world non-linear regression.
no code implementations • NeurIPS 2014 • Minjie Xu, Balaji Lakshminarayanan, Yee Whye Teh, Jun Zhu, Bo Zhang
We propose a distributed Markov chain Monte Carlo (MCMC) inference algorithm for large scale Bayesian posterior simulation.
2 code implementations • NeurIPS 2014 • Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics.
no code implementations • 30 Apr 2013 • Balaji Lakshminarayanan, Yee Whye Teh
A popular approach for large scale data annotation tasks is crowdsourcing, wherein each data point is labeled by multiple noisy annotators.
no code implementations • 3 Mar 2013 • Balaji Lakshminarayanan, Daniel M. Roy, Yee Whye Teh
Unlike classic decision tree learning algorithms like ID3, C4.5, and CART, which work in a top-down manner, existing Bayesian algorithms produce an approximation to the posterior distribution by evolving a complete tree (or collection thereof) iteratively via local Monte Carlo modifications to the structure of the tree, e.g., using Markov chain Monte Carlo (MCMC).