Search Results for author: Joe Benton

Found 7 papers, 2 papers with code

Measuring Feature Sparsity in Language Models

no code implementations • 11 Oct 2023 • Mingyang Deng, Lucas Tao, Joe Benton

We show our metrics can predict the level of sparsity on synthetic sparse linear activations, and can distinguish between sparse linear data and several other distributions.

Language Modelling

Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization

no code implementations • 7 Aug 2023 • Joe Benton, Valentin De Bortoli, Arnaud Doucet, George Deligiannidis

We provide the first convergence bounds which are linear in the data dimension (up to logarithmic factors) assuming only finite second moments of the data distribution.

Denoising

Error Bounds for Flow Matching Methods

no code implementations • 26 May 2023 • Joe Benton, George Deligiannidis, Arnaud Doucet

Previous work derived bounds on the approximation error of diffusion models under the stochastic sampling regime, given assumptions on the $L^2$ loss.

Denoising

From Denoising Diffusions to Denoising Markov Models

1 code implementation • 7 Nov 2022 • Joe Benton, Yuyang Shi, Valentin De Bortoli, George Deligiannidis, Arnaud Doucet

We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching.

Denoising

Alpha-divergence Variational Inference Meets Importance Weighted Auto-Encoders: Methodology and Asymptotics

no code implementations • NeurIPS 2023 • Kamélia Daudel, Joe Benton, Yuyang Shi, Arnaud Doucet

We then provide two complementary theoretical analyses of the VR-IWAE bound and thus of the standard IWAE bound.

Variational Inference

Polysemanticity and Capacity in Neural Networks

no code implementations • 4 Oct 2022 • Adam Scherlis, Kshitij Sachan, Adam S. Jermyn, Joe Benton, Buck Shlegeris

We show that in a toy model the optimal capacity allocation tends to monosemantically represent the most important features, polysemantically represent less important features (in proportion to their impact on the loss), and entirely ignore the least important features.

A Continuous Time Framework for Discrete Denoising Models

1 code implementation • 30 May 2022 • Andrew Campbell, Joe Benton, Valentin De Bortoli, Tom Rainforth, George Deligiannidis, Arnaud Doucet

We provide the first complete continuous time framework for denoising diffusion models of discrete data.

Denoising
