Search Results for author: Bruno Loureiro

Found 30 papers, 19 papers with code

Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression

no code implementations · 21 Feb 2024 · Lucas Clarté, Adrien Vandenbroucque, Guillaume Dalle, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We investigate popular resampling methods, such as subsampling, the bootstrap, and the jackknife, for estimating the uncertainty of statistical models, and their performance in high-dimensional supervised regression tasks.

regression
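
Since no code is linked for this entry, here is a minimal numpy sketch of the kind of resampling pipeline the paper studies, assuming a ridge estimator and a pair-resampling bootstrap; the dimensions, penalty, and noise level are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 50, 0.1                  # samples, dimension, ridge penalty

# Synthetic regression data: y = X w* / sqrt(d) + noise
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_star / np.sqrt(d) + 0.5 * rng.standard_normal(n)

def ridge(X, y, lam):
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Pair-resampling bootstrap: refit the estimator on resampled (x_i, y_i) pairs
B = 500
boots = np.empty((B, d))
for b in range(B):
    idx = rng.integers(0, n, size=n)      # draw n pairs with replacement
    boots[b] = ridge(X[idx], y[idx], lam)

w_hat = ridge(X, y, lam)
se = boots.std(axis=0)                    # bootstrap standard errors per coefficient
print("first 3 coefficients:  ", np.round(w_hat[:3], 3))
print("bootstrap std errors:  ", np.round(se[:3], 3))
```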

Asymptotics of Learning with Deep Structured (Random) Features

no code implementations · 21 Feb 2024 · Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro

For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large.
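
A toy numpy sketch of this setting under assumed choices (tanh hidden layers, Gaussian inputs, a linear teacher, a ridge-trained readout): the hidden layers are frozen random feature maps and only the readout is learned, which is the regime the asymptotic characterisation addresses.

```python
import numpy as np

rng = np.random.default_rng(1)
d, widths, n, n_test, lam = 100, [150, 120], 300, 2000, 1e-3

# Frozen random hidden layers; only the readout is trained
Ws, prev = [], d
for k in widths:
    Ws.append(rng.standard_normal((k, prev)) / np.sqrt(prev))
    prev = k

def features(X):
    H = X
    for W in Ws:
        H = np.tanh(H @ W.T)              # deep structured random feature map
    return H

w_star = rng.standard_normal(d) / np.sqrt(d)
X_tr, X_te = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y_tr, y_te = X_tr @ w_star, X_te @ w_star  # linear teacher

Phi = features(X_tr)
# Ridge-trained readout on top of the frozen feature map
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y_tr)
test_err = np.mean((features(X_te) @ a - y_te) ** 2)
print(f"readout test error: {test_err:.4f}")
```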

A High Dimensional Model for Adversarial Training: Geometry and Trade-Offs

1 code implementation · 8 Feb 2024 · Kasimir Tanner, Matteo Vilucchio, Bruno Loureiro, Florent Krzakala

This work investigates adversarial training in the context of margin-based linear classifiers in the high-dimensional regime where the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha = n / d$.

Adversarial Robustness
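
For margin-based linear classifiers, the worst-case $\ell_2$ perturbation of norm $\varepsilon$ has a closed form: it shifts each margin by exactly $\varepsilon\|w\|$. The numpy sketch below (an illustration of the setup, not the linked implementation) trains a logistic-loss linear classifier on this adversarial objective at fixed ratio $\alpha = n/d$; all hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 200
alpha, eps, lr, steps = 2.0, 0.3, 0.5, 2000
n = int(alpha * d)                        # fixed sample ratio alpha = n/d

w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.sign(X @ w_star)

w = 0.01 * rng.standard_normal(d)
for _ in range(steps):
    norm_w = np.linalg.norm(w) + 1e-12
    # Worst-case l2 attack of norm eps reduces every margin by eps * ||w||
    margins = np.clip(y * (X @ w) - eps * norm_w, -30.0, 30.0)
    s = 1.0 / (1.0 + np.exp(margins))     # sigmoid(-margin)
    # Gradient of the mean adversarial logistic loss
    grad = -(X * (s * y)[:, None]).sum(axis=0) / n + eps * s.mean() * w / norm_w
    w -= lr * grad

clean_acc = np.mean(np.sign(X @ w) == y)
robust_acc = np.mean(y * (X @ w) > eps * np.linalg.norm(w))  # survives the attack
print(f"clean acc: {clean_acc:.3f}   robust acc: {robust_acc:.3f}")
```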

Asymptotics of feature learning in two-layer networks after one gradient-step

1 code implementation · 7 Feb 2024 · Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

To our knowledge, our results provide the first tight description of the impact of feature learning on the generalization of two-layer neural networks in the large-learning-rate regime $\eta=\Theta_{d}(d)$, beyond perturbative finite-width corrections of the conjugate and neural tangent kernels.
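
A numpy sketch of the one-giant-step experiment this result describes, under assumed choices (tanh activations, a single-index tanh teacher, squared loss, ridge readout) and illustrative scalings: take one full-batch gradient step on the first layer with $\eta = \Theta(d)$, refit the readout, and compare test errors.

```python
import numpy as np

rng = np.random.default_rng(3)
d, p, n, n_test = 100, 100, 400, 2000
eta, lam = 1.0 * d, 1e-3                  # "giant" learning rate eta = Theta(d)

w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
target = lambda X: np.tanh(X @ w_star)    # single-index teacher

X, Xt = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y, yt = target(X), target(Xt)

W = rng.standard_normal((p, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=p)

def test_err(W):
    H, Ht = np.tanh(X @ W.T), np.tanh(Xt @ W.T)
    b = np.linalg.solve(H.T @ H + lam * np.eye(p), H.T @ y)  # ridge readout
    return np.mean((Ht @ b - yt) ** 2)

print(f"test error before step: {test_err(W):.4f}")

# One full-batch gradient step on the first layer (squared loss)
Z = X @ W.T
S = np.tanh(Z)
resid = (S @ a / np.sqrt(p) - y) / n
grad_W = ((resid[:, None] * (1 - S ** 2)) * (a / np.sqrt(p))).T @ X
W1 = W - eta * grad_W

print(f"test error after one giant step: {test_err(W1):.4f}")
print("max |row overlap with w*|:", np.abs(W1 @ w_star).max().round(3))
```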

High-dimensional robust regression under heavy-tailed data: Asymptotics and Universality

no code implementations · 28 Sep 2023 · Urte Adomaityte, Leonardo Defilippis, Bruno Loureiro, Gabriele Sicuro

In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise distributions, including cases where second and higher moments do not exist.

regression
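
A hedged numpy sketch of an M-estimation experiment in this spirit: Student-t covariates and noise (heavy tails, but with finite variance here, unlike the paper's most extreme cases), a Huber M-estimator fitted by the standard IRLS scheme, compared against least squares. The distributions and the Huber threshold are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, delta, df = 400, 100, 1.0, 3.0      # Student-t with df = 3 degrees of freedom

theta_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_t(df, size=(n, d))       # heavy-tailed covariates
y = X @ theta_star + rng.standard_t(df, size=n)   # heavy-tailed noise

def huber_irls(X, y, delta, iters=50):
    """Huber M-estimator via iteratively reweighted least squares."""
    p = X.shape[1]
    theta = np.zeros(p)
    for _ in range(iters):
        r = y - X @ theta
        w = np.where(np.abs(r) <= delta, 1.0, delta / np.maximum(np.abs(r), 1e-12))
        XW = X * w[:, None]               # weighted design
        theta = np.linalg.solve(XW.T @ X + 1e-8 * np.eye(p), XW.T @ y)
    return theta

theta_h = huber_irls(X, y, delta)
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print("Huber estimation error:        ", round(float(np.linalg.norm(theta_h - theta_star)), 3))
print("Least-squares estimation error:", round(float(np.linalg.norm(theta_ls - theta_star)), 3))
```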

Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD

1 code implementation · 29 May 2023 · Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

These insights are grounded in the reduction of SGD dynamics to a stochastic process in lower dimensions, where escaping mediocrity equates to calculating an exit time.
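
A minimal simulation in the spirit of this reduction, under assumptions chosen for illustration (a single spherical neuron rather than a full two-layer network, an information-exponent-2 single-index target, one-pass SGD): start from a random initialisation, whose overlap with the teacher is of order $1/\sqrt{d}$, and record the exit time at which the overlap leaves the mediocrity plateau.

```python
import numpy as np

rng = np.random.default_rng(5)
d, eta, max_steps, threshold = 100, 0.005, 500_000, 0.5

w_star = rng.standard_normal(d); w_star /= np.linalg.norm(w_star)
w = rng.standard_normal(d);      w /= np.linalg.norm(w)

exit_time = None
for t in range(max_steps):
    x = rng.standard_normal(d)
    ls, lt = w @ x, w_star @ x            # student / teacher local fields
    f, y = ls ** 2, lt ** 2               # information-exponent-2 single-index task
    w -= eta * (f - y) * 2 * ls * x       # one-sample SGD step on the squared loss
    w /= np.linalg.norm(w)                # spherical projection
    if abs(w @ w_star) > threshold:       # left the mediocrity plateau
        exit_time = t + 1
        break

print("initial overlap scale 1/sqrt(d):", round(1 / np.sqrt(d), 3))
print("exit time (steps to overlap > 0.5):", exit_time)
```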

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time

1 code implementation · 29 May 2023 · Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

The picture drastically improves over multiple gradient steps: we show that a batch size of $n = \mathcal{O}(d)$ is indeed enough to learn multiple target directions satisfying a staircase property, whereby more and more directions can be learned over time.
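
A numpy sketch of the multi-step protocol, with assumed choices (tanh network, squared loss, a two-direction staircase target $y = x_1 + x_1 x_2$, fresh batches of size $O(d)$, a large learning rate): after each step it prints the overlaps of the first-layer weights with the two target directions. Hyperparameters are illustrative, not tuned to reproduce the paper's figures.

```python
import numpy as np

rng = np.random.default_rng(13)
d, p, batch, steps = 100, 50, 400, 3
eta = 2.0 * d                             # giant learning rate, Theta(d)

def target(X):                            # staircase target: x1 + x1 * x2
    return X[:, 0] + X[:, 0] * X[:, 1]

W = rng.standard_normal((p, d)) / np.sqrt(d)
a = rng.choice([-1.0, 1.0], size=p) / np.sqrt(p)

for t in range(steps):
    X = rng.standard_normal((batch, d))   # fresh batch of size O(d) each step
    y = target(X)
    Z = np.tanh(X @ W.T)
    resid = (Z @ a - y) / batch
    W -= eta * ((resid[:, None] * (1 - Z ** 2)) * a).T @ X
    ov1, ov2 = np.abs(W[:, 0]).max(), np.abs(W[:, 1]).max()
    print(f"step {t + 1}: max |overlap| with e1 = {ov1:.3f}, with e2 = {ov2:.3f}")
```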

Expectation consistency for calibration of neural networks

2 code implementations · 5 Mar 2023 · Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite their incredible performance, it is well reported that deep neural networks tend to be overoptimistic about their prediction confidence.

Uncertainty Quantification
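
One simple reading of this calibration idea is a rule that picks a temperature at which the average top-class confidence matches the validation accuracy. The numpy sketch below implements that expectation-consistency-style rule by bisection on synthetic overconfident logits; it illustrates the principle and is not the authors' released code.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 5000, 10

# Synthetic overconfident validation logits: correct class boosted, then scaled up
labels = rng.integers(0, k, size=n)
logits = rng.standard_normal((n, k))
logits[np.arange(n), labels] += 1.0
logits *= 3.0                             # mimics an overconfident network

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

acc = np.mean(softmax(logits).argmax(axis=1) == labels)

def avg_confidence(T):
    return softmax(logits / T).max(axis=1).mean()

# Pick T so average confidence equals accuracy (avg_confidence decreases in T)
lo, hi = 1e-2, 1e2
for _ in range(60):
    mid = np.sqrt(lo * hi)                # geometric bisection
    if avg_confidence(mid) > acc:
        lo = mid                          # still overconfident: raise temperature
    else:
        hi = mid
T = np.sqrt(lo * hi)
print(f"accuracy={acc:.3f}  raw conf={avg_confidence(1.0):.3f}  "
      f"calibrated conf={avg_confidence(T):.3f}  T={T:.2f}")
```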

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

1 code implementation · 17 Feb 2023 · Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we ask ourselves the question: "when is a single Gaussian enough to characterize the error?".
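
A quick numpy universality check in this spirit (sizes and regularisation are illustrative): train ridge-regularised logistic regression on i.i.d. Rademacher covariates and on Gaussian covariates with the same mean and covariance, and compare the training losses reached.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d, lam = 300, 150, 0.05

def train_loss(X, y, steps=3000, lr=0.5):
    """Ridge-regularised logistic regression by gradient descent; final loss."""
    w = np.zeros(d)
    for _ in range(steps):
        m = np.clip(y * (X @ w), -30.0, 30.0)
        s = 1.0 / (1.0 + np.exp(m))       # sigmoid(-margin)
        grad = -(X * (s * y)[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    m = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -m)) + 0.5 * lam * w @ w

teacher = rng.standard_normal(d)

# Non-Gaussian covariates: i.i.d. Rademacher entries (same first two moments
# as standard Gaussian entries)
X_ng = rng.choice([-1.0, 1.0], size=(n, d)) / np.sqrt(d)
y_ng = np.sign(X_ng @ teacher)

# Gaussian covariates with the matching covariance
X_g = rng.standard_normal((n, d)) / np.sqrt(d)
y_g = np.sign(X_g @ teacher)

print(f"non-Gaussian training loss: {train_loss(X_ng, y_ng):.4f}")
print(f"Gaussian training loss:     {train_loss(X_g, y_g):.4f}")
```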

From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks

1 code implementation · 12 Feb 2023 · Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro

This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar, though not necessarily identical, target function.

Deterministic equivalent and error universality of deep random features learning

1 code implementation · 1 Feb 2023 · Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro

Establishing this result requires proving a deterministic equivalent for traces of the deep random features sample covariance matrices which can be of independent interest.

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

1 code implementation · 26 May 2022 · Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors.

Clustering
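
A minimal numpy illustration of the model described above, with a spectral (PCA) baseline rather than the paper's own approach: two Gaussian clusters whose $\pm$mean is an $s$-sparse vector, clustered by the top eigenvector of the sample covariance. Sizes and signal strength are assumptions.

```python
import numpy as np

rng = np.random.default_rng(8)
d, n, s, snr = 500, 1000, 10, 4.0         # s-sparse cluster means, ||mu|| = snr

# Sparse cluster mean: +/- mu supported on s random coordinates
support = rng.choice(d, size=s, replace=False)
mu = np.zeros(d)
mu[support] = snr / np.sqrt(s)

labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, d)) + np.where(labels[:, None] == 1, mu, -mu)

# Spectral estimate: top eigenvector of the sample covariance
_, vecs = np.linalg.eigh(X.T @ X / n)
v = vecs[:, -1]
pred = (X @ v > 0).astype(int)

acc = max(np.mean(pred == labels), np.mean(pred != labels))  # up to label swap
overlap = abs(v @ mu) / np.linalg.norm(mu)
print(f"clustering accuracy: {acc:.3f}   mean-direction overlap: {overlap:.3f}")
```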

Gaussian Universality of Perceptrons with Random Labels

2 code implementations · 26 May 2022 · Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance.

Learning curves for the multi-class teacher-student perceptron

1 code implementation · 22 Mar 2022 · Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality.

Binary Classification · Learning Theory
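
A numpy sketch of the multi-class teacher-student setup (illustrative sizes; plain gradient descent rather than the paper's asymptotic analysis): labels are the argmax of a Gaussian teacher's scores, and ERM with cross-entropy plus ridge regularisation is run on a softmax linear student.

```python
import numpy as np

rng = np.random.default_rng(9)
n, d, K, lam, lr, steps = 500, 100, 3, 0.01, 1.0, 2000

W_star = rng.standard_normal((K, d))
def labels(X):
    return np.argmax(X @ W_star.T, axis=1)    # multi-class teacher

X = rng.standard_normal((n, d)) / np.sqrt(d)
Xt = rng.standard_normal((4000, d)) / np.sqrt(d)
y, yt = labels(X), labels(Xt)

W = np.zeros((K, d))
Y = np.eye(K)[y]                              # one-hot labels
for _ in range(steps):
    Z = X @ W.T
    Z -= Z.max(axis=1, keepdims=True)
    P = np.exp(Z); P /= P.sum(axis=1, keepdims=True)
    grad = (P - Y).T @ X / n + lam * W        # cross-entropy + ridge gradient
    W -= lr * grad

acc = np.mean(np.argmax(Xt @ W.T, axis=1) == yt)
print(f"test accuracy: {acc:.3f}")
```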

Theoretical characterization of uncertainty in high-dimensional linear classification

1 code implementation · 7 Feb 2022 · Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this manuscript, we characterise uncertainty for learning from a limited number of samples of high-dimensional Gaussian input data, with labels generated by the probit model.

Classification

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

2 code implementations · 1 Feb 2022 · Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent.

Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

no code implementations · 31 Jan 2022 · Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, Gabriele Sicuro, Florent Krzakala

From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice.

Error Scaling Laws for Kernel Classification under Source and Capacity Conditions

no code implementations · 29 Jan 2022 · Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We find that our rates tightly describe the learning curves for this class of data sets, and are also observed on real data.

Classification
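
A toy numpy version of a source/capacity-style experiment; the exponents and the noiseless sign teacher are assumptions made for illustration, not the paper's exact conditions. Features have a power-law spectrum (capacity), the teacher has power-law coefficients (source), and the empirical decay of the misclassification error with $n$ is printed.

```python
import numpy as np

rng = np.random.default_rng(10)
p = 1000
alpha_cap, beta = 1.5, 1.0                # capacity- and source-like exponents

k = np.arange(1, p + 1)
lam_k = k ** (-alpha_cap)                 # power-law kernel spectrum
theta = k ** (-beta)                      # power-law teacher coefficients

def test_error(n, ridge=1e-6):
    X = rng.standard_normal((n, p)) * np.sqrt(lam_k)
    y = np.sign(X @ theta)                # noiseless sign teacher
    w = np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
    Xt = rng.standard_normal((4000, p)) * np.sqrt(lam_k)
    return np.mean(np.sign(Xt @ w) != np.sign(Xt @ theta))

for n in [100, 200, 400, 800]:
    print(f"n={n:4d}  misclassification error: {test_error(n):.4f}")
```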

Bayesian Inference with Nonlinear Generative Models: Comments on Secure Learning

no code implementations · 19 Jan 2022 · Ali Bereyhi, Bruno Loureiro, Florent Krzakala, Ralf R. Müller, Hermann Schulz-Baldes

Unlike the classical linear model, nonlinear generative models have been addressed only sparsely in the statistical learning literature.

Bayesian Inference

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

no code implementations · NeurIPS 2021 · Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this work, we unify and extend this line of work, providing a characterization of all regimes and excess error decay rates that can be observed in terms of the interplay of noise and regularization.

regression

Learning curves of generic features maps for realistic datasets with a teacher-student model

1 code implementation · NeurIPS 2021 · Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

While still solvable in a closed form, this generalization is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework.

The Gaussian equivalence of generative models for learning with shallow neural networks

1 code implementation · 25 Jun 2020 · Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, Marc Mézard, Lenka Zdeborová

Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models.

BIG-bench Machine Learning

Phase retrieval in high dimensions: Statistical and computational phase transitions

1 code implementation · NeurIPS 2020 · Antoine Maillard, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We consider the phase retrieval problem of reconstructing a $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_\mu = | \sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbf{\Phi}$, in a high-dimensional setting where $m, n\to\infty$ while $\alpha = m/n=\Theta(1)$.

Retrieval
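
A compact numpy sketch of the real-valued model in the abstract, solved here with a standard spectral initialisation plus gradient descent on the intensity loss (a Wirtinger-flow-style recipe, not the paper's analysis); the signal norm is assumed known, and $m/n$ is set well above threshold for a quick demo.

```python
import numpy as np

rng = np.random.default_rng(11)
n, m = 100, 800                           # signal dimension, number of measurements
lr, steps = 0.1, 2000

x_star = rng.standard_normal(n)
Phi = rng.standard_normal((m, n))
y = np.abs(Phi @ x_star) / np.sqrt(n)     # amplitude observations Y_mu = |phi_mu . x* / sqrt(n)|

# Spectral initialisation: top eigenvector of the y^2-weighted covariance
M = (Phi.T * (y ** 2)) @ Phi / m
_, vecs = np.linalg.eigh(M)
x = vecs[:, -1] * np.linalg.norm(x_star)  # rescale, assuming the norm is known

# Gradient descent on the intensity loss  (1/4m) sum_i (z_i^2 - y_i^2)^2
for _ in range(steps):
    z = Phi @ x / np.sqrt(n)
    x -= lr * Phi.T @ ((z ** 2 - y ** 2) * z) / (m * np.sqrt(n))

err = min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star)) / np.linalg.norm(x_star)
print(f"relative reconstruction error (up to global sign): {err:.4f}")
```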

Generalisation error in learning with random features and the hidden manifold model

no code implementations · ICML 2020 · Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová

In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression, with a peak at the interpolation threshold; we illustrate the superiority of orthogonal over random Gaussian projections in learning with random features; and we discuss the role played by correlations in the data generated by the hidden manifold model.

regression
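
The double-descent phenomenon is easy to reproduce in a few lines. The numpy sketch below uses a regression variant with ReLU random features for simplicity (the paper's analysis covers logistic regression as well), sweeping the number of features $p$ through the interpolation threshold $p = n$; all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(12)
d, n, n_test, lam = 100, 200, 2000, 1e-6  # near-ridgeless to expose the peak

w_star = rng.standard_normal(d) / np.sqrt(d)
X, Xt = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y = X @ w_star + 0.1 * rng.standard_normal(n)
yt = Xt @ w_star

for p in [50, 100, 150, 200, 250, 400, 800]:
    F = rng.standard_normal((p, d)) / np.sqrt(d)            # random projection
    Phi, Phit = np.maximum(X @ F.T, 0), np.maximum(Xt @ F.T, 0)  # ReLU features
    a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p), Phi.T @ y)
    err = np.mean((Phit @ a - yt) ** 2)
    marker = "  <- interpolation threshold p = n" if p == n else ""
    print(f"p={p:4d}  test error: {err:.4f}{marker}")
```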

Exact asymptotics for phase retrieval and compressed sensing with random generative priors

no code implementations · 4 Dec 2019 · Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

We consider the problems of compressed sensing and of (real-valued) phase retrieval with a random measurement matrix.

Retrieval

The spiked matrix model with generative priors

2 code implementations · NeurIPS 2019 · Benjamin Aubin, Bruno Loureiro, Antoine Maillard, Florent Krzakala, Lenka Zdeborová

Here, we replace the sparsity assumption by generative modelling, and investigate the consequences on statistical and algorithmic properties.

Dimensionality Reduction
