Search Results for author: Ben Adlam

Found 24 papers, 2 papers with code

Understanding Optimal Feature Transfer via a Fine-Grained Bias-Variance Analysis

no code implementations · 18 Apr 2024 · Yufan Li, Subhabrata Sen, Ben Adlam

In the transfer learning paradigm, models learn useful representations (or features) during a data-rich pretraining stage, and then use the pretrained representation to improve model performance on data-scarce downstream tasks.

Transfer Learning
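
The two-stage paradigm described above can be sketched in a few lines; the feature map and data below are hypothetical stand-ins rather than the paper's setting, and the point is only to show a representation learned (and here frozen) upstream being reused by a small head fit on scarce downstream data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained representation: a frozen random projection + ReLU.
# In practice this would be the feature map learned during pretraining.
d_in, d_feat = 50, 200
W_pretrained = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)
features = lambda X: np.maximum(X @ W_pretrained, 0.0)

# Data-scarce downstream task.
n_train = 30
X_train = rng.normal(size=(n_train, d_in))
y_train = rng.normal(size=n_train)

# Transfer: keep the representation frozen and fit only a ridge-regression head.
Phi = features(X_train)
ridge = 1e-2  # illustrative regularization strength
head = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(d_feat), Phi.T @ y_train)

X_test = rng.normal(size=(5, d_in))
y_pred = features(X_test) @ head  # downstream predictions from transferred features
```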

Kernel Regression with Infinite-Width Neural Networks on Millions of Examples

no code implementations · 9 Mar 2023 · Ben Adlam, Jaehoon Lee, Shreyas Padhy, Zachary Nado, Jasper Snoek

Using this approach, we study scaling laws of several neural kernels across many orders of magnitude for the CIFAR-5m dataset.

Data Augmentation · Regression
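
The underlying computation, kernel regression with an infinite-width network kernel, can be sketched at a small scale with the open-source neural_tangents library; the architecture, toy data, and ridge regularizer below are illustrative assumptions, while the paper's contribution is scaling this type of computation to millions of examples.

```python
import jax.numpy as jnp
from jax import random
from neural_tangents import stax

# A small fully-connected architecture; kernel_fn returns its infinite-width
# NNGP and NTK kernels analytically.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x_train = random.normal(random.PRNGKey(0), (256, 32))   # toy inputs
y_train = random.normal(random.PRNGKey(1), (256, 1))    # toy targets
x_test = random.normal(random.PRNGKey(2), (16, 32))

# Kernel ridge regression with the NNGP kernel.
k_tt = kernel_fn(x_train, x_train, get='nngp')
k_st = kernel_fn(x_test, x_train, get='nngp')
reg = 1e-4  # illustrative ridge term
alpha = jnp.linalg.solve(k_tt + reg * jnp.eye(k_tt.shape[0]), y_train)
y_pred = k_st @ alpha  # predictions of the infinite-width kernel regressor
```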

Ensembling over Classifiers: a Bias-Variance Perspective

no code implementations · 21 Jun 2022 · Neha Gupta, Jamie Smith, Ben Adlam, Zelda Mariet

Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction. We conclude with an empirical analysis of recent deep learning methods that ensemble over hyperparameters, revealing that these techniques indeed favor bias reduction.
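
As a concrete baseline for what "standard ensembling" of classifiers can mean, one common choice is to average the members' predicted class probabilities; the data below are hypothetical.

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average per-class probabilities across members, then take the argmax.

    member_probs: array of shape (n_members, n_examples, n_classes).
    """
    mean_probs = member_probs.mean(axis=0)
    return mean_probs.argmax(axis=-1)

# Three hypothetical ensemble members, four examples, three classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 4, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)  # softmax
print(ensemble_predict(probs))  # ensembled class predictions, shape (4,)
```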

Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions

no code implementations · 15 Jun 2022 · Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington

Stochastic gradient descent (SGD) is a pillar of modern machine learning, serving as the go-to optimization algorithm for a diverse array of problems.

Computational Efficiency

Homogenization of SGD in high-dimensions: Exact dynamics and generalization properties

no code implementations · 14 May 2022 · Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington

By analyzing homogenized SGD, we provide exact non-asymptotic high-dimensional expressions for the generalization performance of SGD in terms of a solution of a Volterra integral equation.

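For reference, a Volterra integral equation of the second kind has the general form $\psi(t) = F(t) + \int_0^t K(t, s)\,\psi(s)\,ds$; the paper expresses the generalization error of homogenized SGD through the solution of an equation of this type (the specific forcing function and kernel are not reproduced here).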

Understanding the bias-variance tradeoff of Bregman divergences

no code implementations · 8 Feb 2022 · Ben Adlam, Neha Gupta, Zelda Mariet, Jamie Smith

We show that, similarly to the label, the central prediction can be interpreted as the mean of a random variable, where the mean operates in a dual space defined by the loss function itself.
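
Concretely, for a Bregman divergence $D_\phi$ generated by a strictly convex $\phi$, the dual-space mean referred to above is $\bar{y} = (\nabla\phi)^{-1}\big(\mathbb{E}[\nabla\phi(\hat{y})]\big)$ for a random prediction $\hat{y}$; when $\phi(y) = \|y\|^2$ (squared error) this reduces to the ordinary mean $\mathbb{E}[\hat{y}]$, recovering the classical decomposition.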

Overparameterization Improves Robustness to Covariate Shift in High Dimensions

no code implementations · NeurIPS 2021 · Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington

A significant obstacle in the development of robust machine learning models is covariate shift, a form of distribution shift that occurs when the input distributions of the training and test sets differ while the conditional label distributions remain the same.

BIG-bench Machine Learning · Out-of-Distribution Generalization +1
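
In symbols, covariate shift means the marginal input distributions differ, $p_{\mathrm{train}}(x) \neq p_{\mathrm{test}}(x)$, while the conditional label distribution is shared, $p_{\mathrm{train}}(y \mid x) = p_{\mathrm{test}}(y \mid x)$.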

Covariate Shift in High-Dimensional Random Feature Regression

no code implementations · 16 Nov 2021 · Nilesh Tripuraneni, Ben Adlam, Jeffrey Pennington

A significant obstacle in the development of robust machine learning models is covariate shift, a form of distribution shift that occurs when the input distributions of the training and test sets differ while the conditional label distributions remain the same.

BIG-bench Machine Learning · Out-of-Distribution Generalization +2

Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit

no code implementations · ICLR 2021 · Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

This gives us a better understanding of the implicit prior NNs place on function space and allows a direct comparison of the calibration of the NNGP and its finite-width analogue.

General Classification · Multi-class Classification +1
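
For context, the NNGP views the network's prior over functions as a Gaussian process $f \sim \mathcal{GP}(0, K)$ with the infinite-width kernel $K$; its posterior predictive at test points then takes the standard Gaussian-process form, with mean $K_{*X}(K_{XX} + \sigma^2 I)^{-1} y$ and covariance $K_{**} - K_{*X}(K_{XX} + \sigma^2 I)^{-1} K_{X*}$, and it is the calibration of this predictive distribution that is compared against finite-width networks.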

Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition

no code implementations · NeurIPS 2020 · Ben Adlam, Jeffrey Pennington

Classical learning theory suggests that the optimal generalization performance of a machine learning model should occur at an intermediate model complexity, with simpler models exhibiting high bias and more complex models exhibiting high variance of the predictive function.

Ensemble Learning · Learning Theory
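
The classical picture invoked above comes from the textbook decomposition of squared-error risk, $\mathbb{E}\big[(\hat{f}(x) - y)^2\big] = (\mathbb{E}[\hat{f}(x)] - f^*(x))^2 + \mathrm{Var}[\hat{f}(x)] + \sigma^2$ (bias squared, variance, and irreducible noise), where the expectation is over the randomness of training; the paper's fine-grained analysis splits these terms further according to the separate sources of randomness.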

Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit

1 code implementation · 14 Oct 2020 · Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

This gives us a better understanding of the implicit prior NNs place on function space and allows a direct comparison of the calibration of the NNGP and its finite-width analogue.

General Classification · Multi-class Classification +1

Finite Versus Infinite Neural Networks: an Empirical Study

no code implementations · NeurIPS 2020 · Jaehoon Lee, Samuel S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Sohl-Dickstein

We perform a careful, thorough, and large scale empirical study of the correspondence between wide neural networks and kernel methods.

Cold Posteriors and Aleatoric Uncertainty

no code implementations · 31 Jul 2020 · Ben Adlam, Jasper Snoek, Samuel L. Smith

Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect).

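The effect refers to tempering the posterior: sampling from $p(\theta \mid \mathcal{D})^{1/T} \propto \big(p(\mathcal{D} \mid \theta)\, p(\theta)\big)^{1/T}$ and finding that temperatures $T < 1$ (a "cold" posterior, sharper than exact Bayesian inference at $T = 1$) can give better predictive performance.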

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks

no code implementations · NeurIPS 2020 · Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington

Modern neural networks are often regarded as complex black-box functions whose behavior is difficult to understand owing to their nonlinear dependence on the data and the nonconvexity in their loss landscapes.

A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning

no code implementations · 2 Dec 2019 · Ben Adlam, Jake Levinson, Jeffrey Pennington

In this work, we focus on this high-dimensional regime in which both the dataset size and the number of features tend to infinity.
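
Here "both tend to infinity" refers to the proportional asymptotics standard in random matrix analyses: the number of examples $n$ and the number of features $d$ grow together with $d / n \to \gamma$ for some fixed $\gamma \in (0, \infty)$.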

Investigating Under and Overfitting in Wasserstein Generative Adversarial Networks

no code implementations · 30 Oct 2019 · Ben Adlam, Charles Weill, Amol Kapoor

We investigate under and overfitting in Generative Adversarial Networks (GANs), using discriminators unseen by the generator to measure generalization.

Learning GANs and Ensembles Using Discrepancy

no code implementations · NeurIPS 2019 · Ben Adlam, Corinna Cortes, Mehryar Mohri, Ningshan Zhang

Generative adversarial networks (GANs) generate data based on minimizing a divergence between two distributions.

Domain Adaptation
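
For the original GAN, this divergence-minimization view corresponds to the minimax objective $\min_G \max_D\; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$, whose inner maximization recovers (up to constants) the Jensen-Shannon divergence between the data and generator distributions; the discrepancy measure studied in the paper is instead tailored to the hypothesis set and loss actually used.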

A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions

no code implementations · 25 Sep 2019 · Ben Adlam, Jake Levinson, Jeffrey Pennington

One of the distinguishing characteristics of modern deep learning systems is that they typically employ neural network architectures that utilize enormous numbers of parameters, often in the millions and sometimes even in the billions.


AdaNet: A Scalable and Flexible Framework for Automatically Learning Ensembles

1 code implementation · 30 Apr 2019 · Charles Weill, Javier Gonzalvo, Vitaly Kuznetsov, Scott Yang, Scott Yak, Hanna Mazzawi, Eugen Hotaj, Ghassen Jerfel, Vladimir Macko, Ben Adlam, Mehryar Mohri, Corinna Cortes

AdaNet is a lightweight TensorFlow-based (Abadi et al., 2015) framework for automatically learning high-quality ensembles with minimal expert intervention.

Neural Architecture Search
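
In the underlying AdaNet theory (Cortes et al., 2017) that the framework builds on, the ensemble $f = \sum_k w_k\, h_k$ over candidate subnetworks $h_k$ is learned by minimizing, roughly, an objective of the form $\frac{1}{m}\sum_{i=1}^m \Phi\big(1 - y_i f(x_i)\big) + \sum_k (\lambda\, r_k + \beta)\, |w_k|$, where $\Phi$ is a convex surrogate loss and $r_k$ is a complexity measure (e.g., Rademacher complexity) of the family from which subnetwork $h_k$ is drawn; this is a paraphrase of the objective, not a statement of the library's exact implementation.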
