no code implementations • 9 Mar 2023 • Ben Adlam, Jaehoon Lee, Shreyas Padhy, Zachary Nado, Jasper Snoek
Using this approach, we study scaling laws of several neural kernels across many orders of magnitude for the CIFAR-5m dataset.
1 code implementation • 15 Jul 2022 • Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures.
1 code implementation • 7 Jul 2022 • Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani
Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge about the characteristics of those functions to deploy BO successfully.
2 code implementations • 1 May 2022 • Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zack Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan
The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles.
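To make the ensembling baseline concrete, the snippet below is a minimal, illustrative sketch (not code from the paper) of how a deep ensemble forms its predictive distribution; the `predict_proba` interface on each member is an assumption for illustration.

```python
import numpy as np

# Illustrative sketch: a deep ensemble averages each member's class
# probabilities; predictive uncertainty can then be summarized, e.g., by the
# entropy of the averaged distribution. `predict_proba` is an assumed interface.
def ensemble_predict(members, x):
    probs = np.stack([m.predict_proba(x) for m in members])  # (M, N, C)
    mean_probs = probs.mean(axis=0)                          # average over members
    entropy = -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)
    return mean_probs, entropy
```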
no code implementations • 15 Dec 2021 • Setareh Ariafar, Justin Gilmer, Zack Nado, Jasper Snoek, Rodolphe Jenatton, George E. Dahl
For example, when tuning hyperparameters for machine learning pipelines on a new problem given a limited budget, one must strike a balance between excluding potentially promising regions and keeping the search space small enough to be tractable.
no code implementations • 7 Oct 2021 • James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton
Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, lead to strong performance.
3 code implementations • 16 Sep 2021 • Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani
Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge about those functions to deploy BO successfully.
2 code implementations • 7 Jun 2021 • Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, Dustin Tran
In this paper we introduce Uncertainty Baselines: high-quality implementations of standard and state-of-the-art deep learning methods on a variety of tasks.
1 code implementation • 23 Apr 2021 • Samuel Kim, Peter Y. Lu, Charlotte Loh, Jamie Smith, Jasper Snoek, Marin Soljačić
Bayesian optimization (BO) is a popular paradigm for global optimization of expensive black-box functions, but there are many domains where the function is not completely a black box.
no code implementations • ICLR 2021 • Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek
This gives us a better understanding of the implicit prior NNs place on function space and allows a direct comparison of the calibration of the NNGP and its finite-width analogue.
no code implementations • ICLR 2021 • Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran
Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness.
1 code implementation • 14 Oct 2020 • Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek
This gives us a better understanding of the implicit prior NNs place on function space and allows a direct comparison of the calibration of the NNGP and its finite-width analogue.
1 code implementation • ICLR 2021 • Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran
Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network.
1 code implementation • 6 Oct 2020 • Benjamin Kompa, Jasper Snoek, Andrew Beam
Uncertainty quantification for complex deep learning models is increasingly important as these techniques see growing use in high-stakes, real-world settings.
2 code implementations • NeurIPS 2020 • Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner
Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems.
no code implementations • 31 Jul 2020 • Ben Adlam, Jasper Snoek, Samuel L. Smith
Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect).
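For reference, the "temperature" here refers to the standard tempered-posterior construction (a textbook formulation, not a quotation from the paper):

```latex
p_T(\theta \mid \mathcal{D}) \;\propto\; \bigl[\, p(\mathcal{D} \mid \theta)\, p(\theta) \,\bigr]^{1/T},
\qquad T < 1 \text{ yields a ``cold'' posterior.}
```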
no code implementations • 10 Jul 2020 • Shreyas Padhy, Zachary Nado, Jie Ren, Jeremiah Liu, Jasper Snoek, Balaji Lakshminarayanan
Accurate estimation of predictive uncertainty in modern neural networks is critical to achieve well-calibrated predictions and detect out-of-distribution (OOD) inputs.
Out-of-Distribution Detection
2 code implementations • NeurIPS 2020 • Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton
Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration.
no code implementations • 19 Jun 2020 • Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji Lakshminarayanan, Jasper Snoek
Using this one line code change, we achieve state-of-the-art on recent covariate shift benchmarks and an mCE of 60.28% on the challenging ImageNet-C dataset; to our knowledge, this is the best result for any model that does not incorporate additional data augmentation or modification of the training pipeline.
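The sketch below is a hypothetical illustration of one way such a single-line change is often realized in practice: letting batch-normalization layers use the test batch's statistics at prediction time. The PyTorch usage is an assumption for illustration, not the paper's code.

```python
import torch

# Hypothetical sketch: run the forward pass with batch-norm layers in training
# mode so they normalize with the current test batch's statistics instead of
# the stored running averages. (A careful implementation would switch only the
# BN layers, since model.train() also re-enables dropout.)
def predict_with_test_batch_stats(model, x):
    model.train()
    with torch.no_grad():
        logits = model(x)
    model.eval()
    return logits
```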
1 code implementation • ICML 2020 • Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran
Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning.
no code implementations • 23 Feb 2020 • Setareh Ariafar, Zelda Mariet, Ehsan Elhamifar, Dana Brooks, Jennifer Dy, Jasper Snoek
Casting hyperparameter search as a multi-task Bayesian optimization problem over both hyperparameters and importance sampling (IS) design achieves the best of both worlds: by learning a parameterization of IS that trades off evaluation complexity and quality, we improve upon the state-of-the-art Bayesian optimization runtime and final validation error across a variety of datasets and complex neural architectures.
no code implementations • ICML 2020 • Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin
Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights.
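As a reminder of the object being approximated, variational inference fits a distribution q over the weights by maximizing the evidence lower bound (a standard formulation, not specific to this paper):

```latex
\mathcal{L}(q) \;=\; \mathbb{E}_{q(\theta)}\!\left[\log p(\mathcal{D} \mid \theta)\right]
\;-\; \mathrm{KL}\!\left(q(\theta)\,\|\,p(\theta)\right)
```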
1 code implementation • ICML 2020 • Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin
In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD.
1 code implementation • 14 Jan 2020 • Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton
As a result, the diversity of the predictions stemming from each ensemble member is lost.
no code implementations • 25 Sep 2019 • Jakub Świątkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin
Variational Bayesian Inference is a popular methodology for approximating posterior distributions in Bayesian neural networks.
no code implementations • 25 Sep 2019 • Marton Havasi, Jasper Snoek, Dustin Tran, Jonathan Gordon, José Miguel Hernández-Lobato
Variational inference (VI) is a popular approach for approximate Bayesian inference that is particularly promising for highly parameterized models such as deep neural networks.
5 code implementations • NeurIPS 2019 • Jie Ren, Peter J. Liu, Emily Fertig, Jasper Snoek, Ryan Poplin, Mark A. DePristo, Joshua V. Dillon, Balaji Lakshminarayanan
We propose a likelihood ratio method for deep generative models which effectively corrects for these confounding background statistics.
Out-of-Distribution Detection
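The general shape of such a test is a log-likelihood ratio between an in-distribution model and a background model (sketched here in generic notation; the paper's specific background-model construction is not reproduced):

```latex
\mathrm{LLR}(x) \;=\; \log p_{\theta}(x) \;-\; \log p_{\theta_0}(x)
```

where p_theta is trained on in-distribution data and p_theta_0 is a background model intended to capture population-level statistics.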
2 code implementations • NeurIPS 2019 • Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D. Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, Jasper Snoek
Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Alexey A. Gritsenko, Jasper Snoek, Tim Salimans
Normalising Flows (NFs) are a class of likelihood-based generative models that have recently gained popularity.
no code implementations • ICLR 2019 • Zelda Mariet, Yaniv Ovadia, Jasper Snoek
Determinantal Point Processes (DPPs) provide an elegant and versatile way to sample sets of items that balance the point-wise quality with the set-wise diversity of selected items.
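As a concrete reference for the DPP formalism (a standard L-ensemble calculation, not code from the paper), the probability of selecting a particular subset is a ratio of determinants:

```python
import numpy as np

# Standard L-ensemble DPP: P(S) = det(L_S) / det(L + I), where L_S is the
# principal submatrix of the PSD kernel L indexed by the chosen items S.
def dpp_subset_probability(L, S):
    n = L.shape[0]
    L_S = L[np.ix_(S, S)]
    numerator = np.linalg.det(L_S) if len(S) > 0 else 1.0
    return numerator / np.linalg.det(L + np.eye(n))
```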
no code implementations • 18 Dec 2018 • D. Sculley, Jasper Snoek, Alex Wiltschko
In this position paper, we argue that a tragedy of the commons outcome may be avoided by emphasizing the professional aspects of this service.
4 code implementations • ICLR 2018 • Carlos Riquelme, George Tucker, Jasper Snoek
At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical.
Ranked #1 on Multi-Armed Bandits on Mushroom
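For orientation, the decision rule these posterior approximations plug into is Thompson sampling; below is a minimal sketch with independent Gaussian posteriors over each arm's mean reward (an illustrative toy, not the paper's neural-network setting):

```python
import numpy as np

# Toy Thompson sampling: draw one sample from each arm's posterior over its
# mean reward and play the arm whose sample is largest.
def thompson_step(means, variances):
    sampled = np.random.normal(means, np.sqrt(variances))
    return int(np.argmax(sampled))

def update_arm(mean, var, reward, noise_var=1.0):
    # Conjugate Gaussian update of the played arm's posterior.
    post_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    post_mean = post_var * (mean / var + reward / noise_var)
    return post_mean, post_var
```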
2 code implementations • ICLR 2018 • Gonzalo Mena, David Belanger, Scott Linderman, Jasper Snoek
Permutations and matchings are core building blocks in a variety of latent variable models, as they allow us to align, canonicalize, and sort data.
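A central ingredient for relaxing permutations is the Sinkhorn operator, which alternately normalizes rows and columns; a minimal NumPy sketch (illustrative, not the paper's implementation) is shown below.

```python
import numpy as np
from scipy.special import logsumexp

# Sinkhorn operator: repeated row and column normalization in log space
# pushes exp(log_alpha) toward a doubly-stochastic matrix.
def sinkhorn(log_alpha, n_iters=20):
    log_alpha = np.asarray(log_alpha, dtype=float)
    for _ in range(n_iters):
        log_alpha = log_alpha - logsumexp(log_alpha, axis=1, keepdims=True)
        log_alpha = log_alpha - logsumexp(log_alpha, axis=0, keepdims=True)
    return np.exp(log_alpha)
```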
no code implementations • NeurIPS 2015 • Oren Rippel, Jasper Snoek, Ryan P. Adams
In this work, we demonstrate that, beyond its advantages for efficient computation, the spectral domain also provides a powerful representation in which to model and train convolutional neural networks (CNNs).
Ranked #163 on Image Classification on CIFAR-100
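One way to picture a spectral-domain operation of this kind is frequency-domain downsampling: transform, keep only the lowest frequencies, transform back. The sketch below is an illustrative assumption, not the paper's implementation.

```python
import numpy as np

# Frequency-domain downsampling of a 2-D array: FFT, crop the centred
# low-frequency block, inverse FFT back to the spatial domain.
def spectral_downsample(x, out_h, out_w):
    F = np.fft.fftshift(np.fft.fft2(x))
    h, w = F.shape
    top, left = (h - out_h) // 2, (w - out_w) // 2
    F_crop = F[top:top + out_h, left:left + out_w]
    return np.real(np.fft.ifft2(np.fft.ifftshift(F_crop)))
```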
4 code implementations • 19 Feb 2015 • Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan P. Adams
Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations.
Ranked #152 on Image Classification on CIFAR-100 (using extra training data)
no code implementations • 14 Sep 2014 • Kevin Swersky, David Duvenaud, Jasper Snoek, Frank Hutter, Michael A. Osborne
In practical Bayesian optimization, we must often search over structures with differing numbers of parameters.
1 code implementation • 16 Jun 2014 • Kevin Swersky, Jasper Snoek, Ryan Prescott Adams
In this paper we develop a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings.
1 code implementation • 22 Mar 2014 • Michael A. Gelbart, Jasper Snoek, Ryan P. Adams
Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult black-box objective functions.
1 code implementation • 5 Feb 2014 • Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams
Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions.
no code implementations • NeurIPS 2013 • Jasper Snoek, Richard Zemel, Ryan P. Adams
Point processes are popular models of neural spiking behavior as they provide a statistical distribution over temporal sequences of spikes and help to reveal the complexities underlying a series of recorded action potentials.
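A common starting point for such models is the inhomogeneous Poisson process, whose log-likelihood for spike times t_1, ..., t_N on [0, T] is (standard background, not the paper's full model):

```latex
\log p(t_1, \dots, t_N \mid \lambda) \;=\; \sum_{i=1}^{N} \log \lambda(t_i) \;-\; \int_0^T \lambda(t)\, \mathrm{d}t
```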
1 code implementation • NeurIPS 2013 • Kevin Swersky, Jasper Snoek, Ryan P. Adams
We demonstrate the utility of this new acquisition function by using a small dataset to explore hyperparameter settings for a large dataset.
Ranked #93 on Image Classification on STL-10
4 code implementations • NeurIPS 2012 • Jasper Snoek, Hugo Larochelle, Ryan P. Adams
In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP).
Ranked #185 on Image Classification on CIFAR-10
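To make the GP-based tuning loop concrete, here is a minimal sketch using scikit-learn with an expected-improvement acquisition over a one-dimensional search space (an illustrative assumption, not the paper's implementation):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy GP-based Bayesian optimization (minimization) with expected improvement.
def expected_improvement(mu, sigma, best):
    z = (best - mu) / np.maximum(sigma, 1e-9)
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def bayes_opt(objective, bounds, n_init=3, n_iters=20):
    low, high = bounds
    X = np.random.uniform(low, high, size=(n_init, 1))
    y = np.array([objective(x[0]) for x in X])
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    for _ in range(n_iters):
        gp.fit(X, y)                                  # surrogate over observations so far
        cand = np.linspace(low, high, 1000).reshape(-1, 1)
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        y = np.append(y, objective(x_next[0]))
        X = np.vstack([X, x_next.reshape(1, 1)])
    best = np.argmin(y)
    return X[best, 0], y[best]
```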