no code implementations • 13 Nov 2023 • Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney
Deep learning is renowned for its theory-practice gap, whereby principled theory typically fails to provide much useful guidance for practical implementation.
no code implementations • 15 Jul 2023 • Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney
The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset.
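As a rough illustration of the interpolating regime (not a method from this paper), the sketch below fits the minimum-norm least-squares interpolator when the parameter count exceeds the sample size; the dimensions and noise level are arbitrary.

```python
# Illustrative sketch only: a canonical interpolating estimator in the
# overparameterized regime (p > n) is the minimum-norm least-squares solution.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                                  # more parameters than data points
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Minimum-norm solution of X beta = y via the pseudoinverse.
beta_hat = np.linalg.pinv(X) @ y

print("training residual:", np.linalg.norm(X @ beta_hat - y))  # ~0: the data are interpolated
print("parameter norm:   ", np.linalg.norm(beta_hat))
```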
no code implementations • 4 Jul 2023 • Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Simsekli
Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms.
no code implementations • 14 Oct 2022 • Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney
One prominent issue is the curse of dimensionality: it is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
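The following sketch is purely illustrative of that comparison, not an experiment from the paper: it computes a Gaussian-process log marginal likelihood alongside the closed-form leave-one-out predictive score, using an assumed RBF kernel with fixed lengthscale and synthetic data of increasing input dimension.

```python
# Illustration only: GP log marginal likelihood vs. closed-form LOO-CV score
# as the input dimension grows (kernel and data are assumptions).
import numpy as np

def rbf_kernel(X, lengthscale=1.0, noise=1e-2):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2) + noise * np.eye(len(X))

def log_marginal_likelihood(K, y):
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def loo_log_predictive(K, y):
    # Closed-form leave-one-out predictive density for GP regression.
    Kinv = np.linalg.inv(K)
    var = 1.0 / np.diag(Kinv)
    mu = y - (Kinv @ y) * var
    return np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * (y - mu) ** 2 / var)

rng = np.random.default_rng(1)
n = 100
for d in [1, 10, 100]:
    X = rng.standard_normal((n, d))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
    K = rbf_kernel(X)
    print(f"d={d:4d}  LML={log_marginal_likelihood(K, y):8.2f}  "
          f"LOO={loo_log_predictive(K, y):8.2f}")
```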
no code implementations • 16 May 2022 • Feynman Liang, Liam Hodgkinson, Michael W. Mahoney
While fat-tailed densities commonly arise as posterior and marginal distributions in robust models and scale mixtures, they pose challenges for Gaussian-based variational inference, which fails to capture their tail decay accurately.
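A minimal illustration of this failure mode (not the paper's method, and with an arbitrary Student-t target): a moment-matched Gaussian drastically underestimates tail probabilities.

```python
# Illustration only: a Gaussian approximation underestimates the tail mass of a
# fat-tailed Student-t target, the failure mode motivating tail-adaptive families.
import numpy as np
from scipy import stats

nu = 3.0                                        # Student-t degrees of freedom (fat tails)
target = stats.t(df=nu)
gauss = stats.norm(loc=0.0, scale=np.sqrt(nu / (nu - 2)))  # moment-matched Gaussian

for threshold in [3, 5, 10]:
    p_t = target.sf(threshold)
    p_g = gauss.sf(threshold)
    print(f"P(X > {threshold:2d}):  Student-t {p_t:.2e}   Gaussian {p_g:.2e}   "
          f"ratio {p_t / p_g:.1e}")
```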
1 code implementation • 6 Feb 2022 • Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney
Our analyses consider (I) hundreds of Transformers trained in different settings, in which we systematically vary the amount of data, the model size and the optimization hyperparameters, (II) a total of 51 pretrained Transformers from eight families of Huggingface NLP models, including GPT2, BERT, etc., and (III) a total of 28 existing and novel generalization metrics.
no code implementations • 2 Aug 2021 • Liam Hodgkinson, Umut Şimşekli, Rajiv Khanna, Michael W. Mahoney
Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms and their dynamics on generalization performance in realistic non-convex settings is still poorly understood.
1 code implementation • NeurIPS 2021 • Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney
Viewing neural network models in terms of their loss landscapes has a long history in the statistical mechanics approach to learning, and in recent years it has received attention within machine learning proper.
3 code implementations • NeurIPS 2021 • Alejandro Queiruga, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney
The recently introduced class of ordinary differential equation networks (ODE-Nets) establishes a fruitful connection between deep learning and dynamical systems.
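As a hedged sketch of the ODE-Net idea rather than the paper's architecture, the block below integrates a learned vector field with a fixed-step forward Euler scheme; the network, step count, and dimensions are assumptions.

```python
# Minimal ODE-Net sketch (illustrative only): the hidden state evolves as
# dh/dt = f(h), integrated with a fixed-step forward Euler scheme.
import torch
import torch.nn as nn

class ODEBlock(nn.Module):
    def __init__(self, dim, n_steps=10):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.n_steps = n_steps

    def forward(self, h, t0=0.0, t1=1.0):
        dt = (t1 - t0) / self.n_steps
        for _ in range(self.n_steps):           # forward Euler: h <- h + dt * f(h)
            h = h + dt * self.f(h)
        return h

block = ODEBlock(dim=16)
x = torch.randn(8, 16)
print(block(x).shape)                            # torch.Size([8, 16])
```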
1 code implementation • NeurIPS 2021 • Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney
We provide a general framework for studying recurrent neural networks (RNNs) trained by injecting noise into hidden states.
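A minimal sketch of hidden-state noise injection, with an assumed vanilla RNN cell and Gaussian noise; it is not the paper's exact model, which treats the noisy updates as discretizations of a stochastic differential equation.

```python
# Sketch only: a vanilla RNN cell with Gaussian noise injected into the
# hidden-state update, active during training and switched off at evaluation.
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    def __init__(self, input_dim, hidden_dim, noise_std=0.1):
        super().__init__()
        self.W = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.U = nn.Linear(input_dim, hidden_dim)
        self.noise_std = noise_std

    def forward(self, x, h):
        h_new = torch.tanh(self.W(h) + self.U(x))
        if self.training:                        # inject noise into the hidden state
            h_new = h_new + self.noise_std * torch.randn_like(h_new)
        return h_new

cell = NoisyRNNCell(input_dim=4, hidden_dim=32)
h = torch.zeros(8, 32)
for t in range(20):                              # unroll over a length-20 sequence
    h = cell(torch.randn(8, 4), h)
print(h.shape)
```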
1 code implementation • ICLR 2021 • N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney
Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity.
Ranked #10 on Sequential Image Classification on Sequential CIFAR-10
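The update described above can be sketched roughly as follows; the parameterization, step size, and initialization here are assumptions rather than the paper's exact scheme, which constrains the linear component for stability.

```python
# Hedged sketch: hidden-state dynamics dh/dt = A h + tanh(W h + U x), i.e. a linear
# component plus a Lipschitz nonlinearity, discretized with a forward Euler step.
import torch
import torch.nn as nn

class LipschitzRNNCell(nn.Module):
    def __init__(self, input_dim, hidden_dim, dt=0.1):
        super().__init__()
        self.A = nn.Parameter(0.1 * torch.randn(hidden_dim, hidden_dim))
        self.W = nn.Parameter(0.1 * torch.randn(hidden_dim, hidden_dim))
        self.U = nn.Linear(input_dim, hidden_dim)
        self.dt = dt

    def forward(self, x, h):
        dh = h @ self.A.T + torch.tanh(h @ self.W.T + self.U(x))
        return h + self.dt * dh                  # one Euler step of size dt

cell = LipschitzRNNCell(input_dim=3, hidden_dim=64)
h = torch.zeros(16, 64)
for t in range(50):
    h = cell(torch.randn(16, 3), h)
print(h.norm(dim=-1).mean())                     # inspect the hidden-state norm after 50 steps
```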
no code implementations • 11 Jun 2020 • Liam Hodgkinson, Michael W. Mahoney
Although stochastic optimization is central to modern machine learning, the precise mechanisms underlying its success, and in particular the role of the stochasticity, remain unclear.
no code implementations • NeurIPS 2020 • Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney
We introduce stochastic normalizing flows, an extension of continuous normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs).
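A bare-bones illustration of the SDE ingredient (the drift, diffusion coefficient, and step sizes are assumptions; the paper's flows learn these and account for the likelihood): base Gaussian samples are pushed through an Euler–Maruyama discretization.

```python
# Illustrative Euler–Maruyama sketch: samples from a base Gaussian are pushed
# through a discretized SDE  dZ_t = mu(Z_t) dt + sigma dW_t.
import numpy as np

def drift(z):                                    # assumed drift; in practice learned
    return -z + np.tanh(z)

rng = np.random.default_rng(0)
n_samples, n_steps, dt, sigma = 1000, 100, 0.01, 0.5

z = rng.standard_normal((n_samples, 2))          # base distribution: N(0, I)
for _ in range(n_steps):                         # Euler–Maruyama discretization
    z = z + drift(z) * dt + sigma * np.sqrt(dt) * rng.standard_normal(z.shape)

print("pushed-forward sample mean:", z.mean(axis=0))
print("pushed-forward sample std: ", z.std(axis=0))
```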
no code implementations • 25 Jan 2020 • Liam Hodgkinson, Robert Salomone, Fred Roosta
Stein importance sampling is a widely applicable technique based on kernelized Stein discrepancy, which corrects the output of approximate sampling algorithms by reweighting the empirical distribution of the samples.
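A hedged sketch of the reweighting idea, with an assumed IMQ-based Stein kernel, a standard-normal target, and a generic QP solver; none of these specifics are taken from the paper.

```python
# Sketch only: reweight biased samples by minimizing the kernelized Stein
# discrepancy of the weighted empirical measure under a standard-normal target.
import numpy as np
from scipy.optimize import minimize

def imq_stein_kernel(X, score, c=1.0, beta=-0.5):
    # Stein kernel built from an IMQ base kernel (c^2 + ||x - y||^2)^beta.
    n, d = X.shape
    diff = X[:, None, :] - X[None, :, :]         # diff[i, j] = x_i - x_j
    r2 = np.sum(diff**2, axis=-1)
    base = (c**2 + r2) ** beta
    grad_x = 2 * beta * diff * (c**2 + r2)[..., None] ** (beta - 1)
    div = (-2 * beta * d * (c**2 + r2) ** (beta - 1)
           - 4 * beta * (beta - 1) * r2 * (c**2 + r2) ** (beta - 2))
    S = score(X)                                 # S[i] = grad log p(x_i)
    term2 = np.einsum('ijk,jk->ij', grad_x, S)   # grad_x k . score(x_j)
    term3 = -np.einsum('ijk,ik->ij', grad_x, S)  # grad_y k . score(x_i)
    term4 = base * (S @ S.T)
    return div + term2 + term3 + term4

def stein_weights(K):
    # Minimize w^T K w over the probability simplex.
    n = len(K)
    res = minimize(lambda w: w @ K @ w, np.full(n, 1.0 / n), jac=lambda w: 2 * K @ w,
                   bounds=[(0.0, 1.0)] * n,
                   constraints=({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},),
                   method='SLSQP')
    return res.x

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2)) + 1.0          # biased approximate samples
K = imq_stein_kernel(X, score=lambda X: -X)      # target N(0, I): grad log p(x) = -x
w = stein_weights(K)
print("unweighted mean:       ", X.mean(axis=0)) # roughly (1, 1): biased
print("Stein-reweighted mean: ", w @ X)          # pulled toward the target mean 0
```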
no code implementations • 19 Jul 2019 • Rajiv Khanna, Liam Hodgkinson, Michael W. Mahoney
We investigate the rate of convergence of weighted kernel herding (WKH) and sequential Bayesian quadrature (SBQ), two kernel-based sampling algorithms for estimating integrals with respect to a target probability measure.
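For orientation only, the sketch below runs plain (unweighted) kernel herding over a finite candidate pool; the weighted variant and SBQ analyzed in the paper differ, and the kernel and target here are assumptions.

```python
# Sketch only: greedy kernel herding over a finite candidate pool, using an RBF
# kernel and a standard-normal target approximated by Monte Carlo.
import numpy as np

def rbf(A, B, lengthscale=1.0):
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * sq / lengthscale**2)

rng = np.random.default_rng(0)
pool = rng.standard_normal((2000, 2))            # candidates drawn from the target p = N(0, I)
K_pool = rbf(pool, pool)
mu = K_pool.mean(axis=1)                         # Monte Carlo estimate of E_p[k(x, .)]

selected = []
scores = mu.copy()
for t in range(20):                              # greedy herding selection
    idx = int(np.argmax(scores))
    selected.append(idx)
    # score(x) = mu(x) - (1 / (t + 1)) * sum over selected points of k(x, x_i)
    scores = mu - K_pool[:, selected].sum(axis=1) / (len(selected) + 1)

herd = pool[selected]
print("herded-sample mean estimate:", herd.mean(axis=0))   # close to E_p[x] = 0
```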
no code implementations • 29 Mar 2019 • Liam Hodgkinson, Robert Salomone, Fred Roosta
Theoretical and algorithmic properties of the resulting sampling methods are established for $\theta \in [0, 1]$ and a range of step sizes.
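The following sketch assumes a theta-method discretization of Langevin dynamics on a standard-normal target, with the implicit part resolved by fixed-point iteration; the target, step size, and solver are illustrative assumptions.

```python
# Hedged sketch: a theta-method Langevin step interpolates the explicit (theta = 0)
# and fully implicit (theta = 1) discretizations.
import numpy as np

def grad_log_p(x):
    return -x                                    # standard-normal target (log-concave)

def theta_langevin_step(x, h, theta, rng, n_fixed_point=20):
    noise = np.sqrt(2 * h) * rng.standard_normal(x.shape)
    x_new = x
    for _ in range(n_fixed_point):               # resolve the implicit equation by fixed point
        x_new = x + h * ((1 - theta) * grad_log_p(x) + theta * grad_log_p(x_new)) + noise
    return x_new

rng = np.random.default_rng(0)
for theta in [0.0, 0.5, 1.0]:
    samples = np.zeros(1000)
    for _ in range(2000):
        samples = theta_langevin_step(samples, h=0.5, theta=theta, rng=rng)
    print(f"theta={theta}: sample std = {samples.std():.3f} (target 1.0)")
```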