Search Results for author: Liam Hodgkinson

Found 16 papers, 5 papers with code

A PAC-Bayesian Perspective on the Interpolating Information Criterion

no code implementations13 Nov 2023 Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

Deep learning is renowned for its theory-practice gap, whereby principled theory typically fails to provide much beneficial guidance for implementation in practice.

The Interpolating Information Criterion for Overparameterized Models

no code implementations15 Jul 2023 Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset.

Model Selection

Generalization Guarantees via Algorithm-dependent Rademacher Complexity

no code implementations4 Jul 2023 Sarah Sachs, Tim van Erven, Liam Hodgkinson, Rajiv Khanna, Umut Simsekli

Algorithm- and data-dependent generalization bounds are required to explain the generalization behavior of modern machine learning algorithms.

Generalization Bounds

Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes

no code implementations14 Oct 2022 Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney

One prominent issue is the curse of dimensionality: it is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.

Gaussian Processes · Uncertainty Quantification
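
As a rough illustration of the two quantities this snippet compares, the sketch below (not from the paper; the data, kernel, and dimensions are arbitrary) computes a fitted GP's log marginal likelihood alongside a cross-validation score as the input dimension grows:

```python
# Minimal sketch (illustrative only): log marginal likelihood vs. a CV score
# for a GP regressor as the input dimension d increases.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
for d in (2, 8, 32):
    X = rng.normal(size=(n, d))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)          # toy target
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.sqrt(d)), alpha=1e-2)
    gp.fit(X, y)
    lml = gp.log_marginal_likelihood_value_                  # GP evidence
    cv = cross_val_score(gp, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"d={d:3d}  log marginal likelihood={lml:9.2f}  CV(-MSE)={cv:7.3f}")
```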

Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows

no code implementations16 May 2022 Feynman Liang, Liam Hodgkinson, Michael W. Mahoney

While fat-tailed densities commonly arise as posterior and marginal distributions in robust models and scale mixtures, they pose a challenge for Gaussian-based variational inference, which fails to capture their tail decay accurately.

Variational Inference
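
A minimal sketch of the tail-adaptive idea suggested by the title, assuming a per-dimension Student-t base distribution under a simple affine transform; the parameter values and the single flow layer are illustrative, not the paper's architecture:

```python
# Illustrative sketch only: a flow with an anisotropic Student-t base
# (one tail parameter per dimension), in the spirit of tail-adaptive VI.
import torch
import torch.distributions as D

d = 3
df = torch.tensor([1.5, 5.0, 30.0])           # per-dimension degrees of freedom (illustrative)
base = D.Independent(D.StudentT(df, torch.zeros(d), torch.ones(d)), 1)

# A single affine "flow" layer; a real model would stack richer transforms.
loc = torch.zeros(d, requires_grad=True)
log_scale = torch.zeros(d, requires_grad=True)
flow = D.TransformedDistribution(base, [D.transforms.AffineTransform(loc, log_scale.exp())])

x = flow.rsample((5,))                        # reparameterized samples for VI
print(flow.log_prob(x))                       # log density under the heavy-tailed flow
```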

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

1 code implementation6 Feb 2022 Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney

Our analyses consider (I) hundreds of Transformers trained in different settings, in which we systematically vary the amount of data, the model size and the optimization hyperparameters, (II) a total of 51 pretrained Transformers from eight families of Huggingface NLP models, including GPT2, BERT, etc., and (III) a total of 28 existing and novel generalization metrics.

Model Selection
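
As one example of a generalization metric that needs no training or testing data, the sketch below fits a heavy-tail (power-law) exponent to a weight matrix's eigenvalue spectrum; the Hill-style estimator, tail fraction, and random placeholder matrix are illustrative choices, not necessarily the paper's exact metrics:

```python
# Sketch of one "data-free" generalization metric: a power-law exponent fitted
# to the eigenvalue spectrum of a weight matrix.
import numpy as np

def tail_exponent(weight: np.ndarray, tail_frac: float = 0.1) -> float:
    """Hill-style estimate of the power-law exponent of the largest eigenvalues
    of W^T W. Smaller exponents are commonly read as heavier-tailed spectra."""
    eigs = np.sort(np.linalg.svd(weight, compute_uv=False) ** 2)[::-1]
    k = max(2, int(tail_frac * len(eigs)))             # tail size (illustrative choice)
    tail = eigs[:k]
    return 1.0 + k / np.sum(np.log(tail / tail[-1]))

rng = np.random.default_rng(0)
W = rng.standard_normal((768, 3072))                   # placeholder for a Transformer weight
print("estimated tail exponent:", round(tail_exponent(W), 3))
```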

Generalization Bounds using Lower Tail Exponents in Stochastic Optimizers

no code implementations2 Aug 2021 Liam Hodgkinson, Umut Şimşekli, Rajiv Khanna, Michael W. Mahoney

Despite the ubiquitous use of stochastic optimization algorithms in machine learning, the precise impact of these algorithms and their dynamics on generalization performance in realistic non-convex settings is still poorly understood.

Generalization Bounds · Stochastic Optimization

Taxonomizing local versus global structure in neural network loss landscapes

1 code implementation NeurIPS 2021 Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

Viewing neural network models in terms of their loss landscapes has a long history in the statistical mechanics approach to learning, and in recent years it has received attention within machine learning proper.

Stateful ODE-Nets using Basis Function Expansions

3 code implementations NeurIPS 2021 Alejandro Queiruga, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney

The recently introduced class of ordinary differential equation networks (ODE-Nets) establishes a fruitful connection between deep learning and dynamical systems.

Image Classification · Sentence
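
A minimal sketch of the general idea suggested by the title, assuming weights that vary continuously with depth through a small basis-function expansion and a forward-Euler integrator; the sizes, basis, and solver are placeholders rather than the paper's design:

```python
# Illustrative sketch only: an ODE-Net whose weights W(t) vary with "depth" t
# via a basis-function expansion, integrated with forward Euler.
import numpy as np

def basis(t, K=4):
    return np.array([t ** k for k in range(K)])    # simple polynomial basis on [0, 1]

rng = np.random.default_rng(0)
d, K = 8, 4
coeffs = rng.standard_normal((K, d, d)) * 0.1      # basis coefficients for W(t)

def W(t):
    return np.tensordot(basis(t, K), coeffs, axes=1)   # W(t) = sum_k b_k(t) C_k

def odenet_forward(h, n_steps=20):
    dt = 1.0 / n_steps
    for i in range(n_steps):
        h = h + dt * np.tanh(W(i * dt) @ h)        # dh/dt = tanh(W(t) h)
    return h

print(odenet_forward(rng.standard_normal(d)))
```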

Noisy Recurrent Neural Networks

1 code implementation NeurIPS 2021 Soon Hoe Lim, N. Benjamin Erichson, Liam Hodgkinson, Michael W. Mahoney

We provide a general framework for studying recurrent neural networks (RNNs) trained by injecting noise into hidden states.

General Classification
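
A minimal sketch of the noise-injection mechanism described above, using a standard PyTorch RNN cell with Gaussian noise added to the hidden state during training; the cell type and noise scale are illustrative, not the paper's exact model:

```python
# Sketch: inject Gaussian noise into the hidden state at every step while training.
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size, noise_std=0.1):
        super().__init__()
        self.cell = nn.RNNCell(input_size, hidden_size)
        self.noise_std = noise_std

    def forward(self, x, h):
        h = self.cell(x, h)
        if self.training:                          # noise only during training
            h = h + self.noise_std * torch.randn_like(h)
        return h

cell = NoisyRNNCell(16, 32)
h = torch.zeros(4, 32)
for x in torch.randn(10, 4, 16):                   # sequence length 10, batch 4
    h = cell(x, h)
print(h.shape)
```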

Lipschitz Recurrent Neural Networks

1 code implementation ICLR 2021 N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, Michael W. Mahoney

Viewing recurrent neural networks (RNNs) as continuous-time dynamical systems, we propose a recurrent unit that describes the hidden state's evolution with two parts: a well-understood linear component plus a Lipschitz nonlinearity.

Language Modelling · Sequential Image Classification
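
A minimal sketch of the two-part dynamics described above (a linear term plus a Lipschitz tanh nonlinearity), integrated with forward Euler; the stability-oriented parameterization of the matrices used in the paper is omitted here:

```python
# Sketch: hidden-state dynamics dh/dt = A h + tanh(W h + U x + b), Euler-integrated.
import torch
import torch.nn as nn

class LipschitzRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size, dt=0.1):
        super().__init__()
        self.A = nn.Parameter(0.1 * torch.randn(hidden_size, hidden_size))
        self.W = nn.Parameter(0.1 * torch.randn(hidden_size, hidden_size))
        self.U = nn.Linear(input_size, hidden_size)
        self.dt = dt

    def forward(self, x, h):
        return h + self.dt * (h @ self.A.T + torch.tanh(h @ self.W.T + self.U(x)))

cell = LipschitzRNNCell(16, 32)
h = torch.zeros(4, 32)
for x in torch.randn(10, 4, 16):
    h = cell(x, h)
print(h.shape)
```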

Multiplicative noise and heavy tails in stochastic optimization

no code implementations11 Jun 2020 Liam Hodgkinson, Michael W. Mahoney

Although stochastic optimization is central to modern machine learning, the precise mechanisms underlying its success, and in particular the role of the stochasticity itself, remain unclear.

Stochastic Optimization

Stochastic Normalizing Flows

no code implementations NeurIPS 2020 Liam Hodgkinson, Chris van der Heide, Fred Roosta, Michael W. Mahoney

We introduce stochastic normalizing flows, an extension of continuous normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs).

Variational Inference
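
A minimal sketch of the forward map of an SDE-based flow, assuming a learnable drift network and a fixed diffusion scale integrated by Euler-Maruyama; the likelihood and variational-training machinery of the paper is not shown:

```python
# Sketch: push base samples forward through an SDE with a learnable drift.
import torch
import torch.nn as nn

d = 2
drift = nn.Sequential(nn.Linear(d + 1, 64), nn.Tanh(), nn.Linear(64, d))
sigma = 0.1                                        # diffusion coefficient (illustrative)

def sde_forward(x, n_steps=50, T=1.0):
    dt = T / n_steps
    for i in range(n_steps):
        t = torch.full((x.shape[0], 1), i * dt)
        x = x + drift(torch.cat([x, t], dim=1)) * dt \
              + sigma * torch.randn_like(x) * dt ** 0.5    # Euler-Maruyama step
    return x

z = torch.randn(128, d)                            # base samples
print(sde_forward(z).shape)                        # pushed-forward samples
```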

The reproducing Stein kernel approach for post-hoc corrected sampling

no code implementations25 Jan 2020 Liam Hodgkinson, Robert Salomone, Fred Roosta

Stein importance sampling is a widely applicable technique based on kernelized Stein discrepancy, which corrects the output of approximate sampling algorithms by reweighting the empirical distribution of the samples.

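A minimal sketch of the generic Stein importance sampling idea described above, assuming an IMQ base kernel, a Langevin Stein kernel, and a simplex-constrained quadratic program for the weights; this illustrates the reweighting step only, not the paper's reproducing Stein kernel construction:

```python
# Sketch: reweight samples to reduce kernelized Stein discrepancy against a target.
import numpy as np
from scipy.optimize import minimize

def stein_kernel_matrix(X, score, c=1.0):
    """K[i, j] = Langevin Stein kernel k_p(x_i, x_j) with an IMQ base kernel."""
    n, d = X.shape
    S = score(X)                                  # grad log p at each sample
    R = X[:, None, :] - X[None, :, :]             # pairwise differences
    r2 = (R ** 2).sum(-1)
    s = c ** 2 + r2
    k = s ** -0.5
    grad_x = -R * s[..., None] ** -1.5            # grad_x k(x, y)
    trace = d * s ** -1.5 - 3.0 * r2 * s ** -2.5  # div_x div_y k(x, y)
    return (trace
            + np.einsum("ijk,jk->ij", grad_x, S)  # grad_x k . score(y)
            - np.einsum("ijk,ik->ij", grad_x, S)  # grad_y k . score(x)
            + k * (S @ S.T))

# Toy example: correct samples drawn at the wrong scale toward N(0, 1).
rng = np.random.default_rng(0)
X = rng.normal(scale=2.0, size=(100, 1))
K = stein_kernel_matrix(X, score=lambda x: -x)    # score of a standard Gaussian

n = len(X)
res = minimize(lambda w: w @ K @ w, np.full(n, 1.0 / n),
               bounds=[(0.0, 1.0)] * n,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
               method="SLSQP")
w = res.x                                         # corrected importance weights
print("weighted mean of x^2 ~", float(w @ (X[:, 0] ** 2)))
```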

Geometric Rates of Convergence for Kernel-based Sampling Algorithms

no code implementations19 Jul 2019 Rajiv Khanna, Liam Hodgkinson, Michael W. Mahoney

The rate of convergence of weighted kernel herding (WKH) and sequential Bayesian quadrature (SBQ), two kernel-based sampling algorithms for estimating integrals with respect to some target probability measure, is investigated.
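
A minimal sketch of plain (uniform-weight) kernel herding in one dimension, with an RBF kernel and a Gaussian target whose kernel mean embedding has a closed form; WKH and SBQ additionally optimize quadrature weights, which is not shown here:

```python
# Sketch: greedy kernel herding against N(0, 1) with an RBF kernel.
import numpy as np

h, sp = 0.5, 1.0                                   # kernel bandwidth, target std

def k(x, y):
    return np.exp(-(x - y) ** 2 / (2 * h ** 2))

def mean_embedding(x):
    # E_{y ~ N(0, sp^2)} k(x, y), closed form for RBF kernel + Gaussian target
    v = h ** 2 + sp ** 2
    return np.sqrt(h ** 2 / v) * np.exp(-x ** 2 / (2 * v))

grid = np.linspace(-4, 4, 2001)                    # candidate points
points = []
for _ in range(20):
    crit = mean_embedding(grid)
    for xi in points:
        crit = crit - k(grid, xi) / (len(points) + 1)
    points.append(grid[np.argmax(crit)])

est = np.mean(np.array(points) ** 2)               # quadrature estimate of E[x^2]
print("herding estimate of E[x^2]:", round(est, 3), "(true value 1.0)")
```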

Implicit Langevin Algorithms for Sampling From Log-concave Densities

no code implementations29 Mar 2019 Liam Hodgkinson, Robert Salomone, Fred Roosta

For sampling from a log-concave density, implicit integrators arising from the $\theta$-method discretization of the overdamped Langevin diffusion are considered; theoretical and algorithmic properties of the resulting sampling methods for $\theta \in [0, 1]$ and a range of step sizes are established.
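
A minimal sketch of a single $\theta$-method step for the overdamped Langevin diffusion, with the implicit equation solved by a generic root finder; the Gaussian target, step size, and choice $\theta = 1/2$ are illustrative, not the paper's recommendations:

```python
# Sketch: theta-method step for dX = grad_log_pi(X) dt + sqrt(2) dW.
import numpy as np
from scipy.optimize import root

def grad_log_pi(x):
    return -x                                      # standard Gaussian target (log-concave)

def theta_step(x, h, theta, rng):
    xi = rng.standard_normal(x.shape)
    explicit = x + h * (1 - theta) * grad_log_pi(x) + np.sqrt(2 * h) * xi
    # solve x_new = explicit + h * theta * grad_log_pi(x_new)
    return root(lambda z: z - explicit - h * theta * grad_log_pi(z), x).x

rng = np.random.default_rng(0)
x, h, theta = np.zeros(1), 0.5, 0.5                # theta = 1/2 (trapezoidal rule)
samples = []
for _ in range(5000):
    x = theta_step(x, h, theta, rng)
    samples.append(x[0])
print("sample variance:", round(np.var(samples), 3))   # roughly 1 for this target
```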
