Search Results for author: Lenka Zdeborová

Found 79 papers, 36 papers with code

Phase transition in the detection of modules in sparse networks

no code implementations 6 Feb 2011 Aurelien Decelle, Florent Krzakala, Cristopher Moore, Lenka Zdeborová

We present an asymptotically exact analysis of the problem of detecting communities in sparse random networks.

Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications

no code implementations 14 Sep 2011 Aurelien Decelle, Florent Krzakala, Cristopher Moore, Lenka Zdeborová

In this paper we extend our previous work on the stochastic block model, a commonly used generative model for social and biological networks, and the problem of inferring functional groups or communities from the topology of the network.

Statistical Mechanics Disordered Systems and Neural Networks Social and Information Networks Physics and Society

Statistical physics-based reconstruction in compressed sensing

1 code implementation 20 Sep 2011 Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

Compressed sensing is triggering a major evolution in signal acquisition.

Statistical Mechanics Information Theory Information Theory

Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

1 code implementation 18 Jun 2012 Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

We further develop the asymptotic analysis of the corresponding phase diagrams with and without measurement noise, for different distributions of signals, and discuss the best possible reconstruction performances regardless of the algorithm.

Statistical Mechanics Information Theory Information Theory

Blind Calibration in Compressed Sensing using Message Passing Algorithms

no code implementations NeurIPS 2013 Christophe Schulke, Francesco Caltagirone, Florent Krzakala, Lenka Zdeborová

We study numerically the phase diagram of the blind calibration problem, and show that even in cases where convex relaxation is possible, our algorithm requires a smaller number of measurements and/or signals in order to perform well.

Phase transitions and sample complexity in Bayes-optimal matrix factorization

no code implementations 6 Feb 2014 Yoshiyuki Kabashima, Florent Krzakala, Marc Mézard, Ayaka Sakata, Lenka Zdeborová

We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm.

blind source separation Dictionary Learning +2

Phase transitions in semisupervised clustering of sparse networks

no code implementations 30 Apr 2014 Pan Zhang, Cristopher Moore, Lenka Zdeborová

For larger $k$ where a hard but detectable regime exists, we find that the easy/hard transition (the point at which efficient algorithms can do better than chance) becomes a line of transitions where the accuracy jumps discontinuously at a critical value of $\alpha$.

Clustering Stochastic Block Model

Spectral Clustering of Graphs with the Bethe Hessian

3 code implementations NeurIPS 2014 Alaa Saade, Florent Krzakala, Lenka Zdeborová

We show that this approach combines the performances of the non-backtracking operator, thus detecting clusters all the way down to the theoretical limit in the stochastic block model, with the computational, theoretical and memory advantages of real symmetric matrices.

Clustering Stochastic Block Model
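
As an illustration of the kind of real symmetric operator this entry refers to, the following minimal sketch builds a Bethe Hessian and uses its negative eigenvalues to split a toy graph. It assumes the commonly used definition $H(r) = (r^2 - 1)I - rA + D$ with $r$ set to the square root of the average degree; the helper names (`bethe_hessian`, `two_cliques`) and the toy graph are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def bethe_hessian(adjacency: np.ndarray, r: float) -> np.ndarray:
    """Assumed definition: H(r) = (r^2 - 1) I - r A + D, with D the degree matrix."""
    n = adjacency.shape[0]
    degrees = adjacency.sum(axis=1)
    return (r ** 2 - 1.0) * np.eye(n) - r * adjacency + np.diag(degrees)


def two_cliques(size: int) -> np.ndarray:
    """Hypothetical toy graph: two cliques of `size` nodes joined by one edge."""
    n = 2 * size
    a = np.zeros((n, n))
    a[:size, :size] = 1.0
    a[size:, size:] = 1.0
    np.fill_diagonal(a, 0.0)  # no self-loops
    a[size - 1, size] = a[size, size - 1] = 1.0  # single bridging edge
    return a


A = two_cliques(5)
r = np.sqrt(A.sum() / A.shape[0])  # square root of the average degree
eigvals, eigvecs = np.linalg.eigh(bethe_hessian(A, r))  # ascending eigenvalues

# Negative eigenvalues signal detectable structure; their count is used here
# as an estimate of the number of clusters.
num_negative = int((eigvals < 0).sum())

# With two clusters, the sign pattern of the second eigenvector gives the split.
labels = (eigvecs[:, 1] > 0).astype(int)
```

On this toy graph the matrix is dense, but for sparse networks the same operator can be stored as a sparse symmetric matrix, which is the memory advantage the snippet above alludes to.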

Sparse Estimation with the Swept Approximated Message-Passing Algorithm

1 code implementation 17 Jun 2014 Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

Approximate Message Passing (AMP) has been shown to be a superior method for inference problems, such as the recovery of signals from sets of noisy, lower-dimensionality measurements, both in terms of reconstruction accuracy and in computational efficiency.

Computational Efficiency

Spectral Detection in the Censored Block Model

no code implementations 31 Jan 2015 Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

We describe two spectral algorithms for this task based on the non-backtracking and the Bethe Hessian operators.

Clustering Community Detection

MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel

1 code implementation 14 Jul 2015 Thibault Lesieur, Florent Krzakala, Lenka Zdeborová

This paper considers probabilistic estimation of a low-rank matrix from non-linear element-wise measurements of its elements.

Stochastic Block Model

Statistical physics of inference: Thresholds and algorithms

1 code implementation 8 Nov 2015 Lenka Zdeborová, Florent Krzakala

Many questions of fundamental interest in today's science can be formulated as inference problems: some partial, or noisy, observations are performed over a set of variables and the goal is to recover, or infer, the values of the variables based on the indirect information contained in the measurements.

Community Detection

Clustering from Sparse Pairwise Measurements

no code implementations 25 Jan 2016 Alaa Saade, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items.

Clustering

Fast Randomized Semi-Supervised Clustering

no code implementations 20 May 2016 Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items.

Clustering General Classification

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

no code implementations 10 Oct 2016 Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, Cris Moore, Lenka Zdeborová

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n, m \rightarrow \infty$ and $\alpha = m/n$ stays finite.

Clustering Vocal Bursts Intensity Prediction

Multi-Layer Generalized Linear Estimation

no code implementations 24 Jan 2017 Andre Manoel, Florent Krzakala, Marc Mézard, Lenka Zdeborová

We consider the problem of reconstructing a signal from multi-layered (possibly) non-linear measurements.

Streaming Bayesian inference: theoretical limits and mini-batch approximate message-passing

no code implementations 2 Jun 2017 Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

In statistical learning for real-world large-scale data problems, one must often resort to "streaming" algorithms which operate sequentially on small batches of data.

Bayesian Inference Clustering

Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

1 code implementation 10 Aug 2017 Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Non-rigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method.

Vocal Bursts Intensity Prediction

Dense Limit of the Dawid-Skene Model for Crowdsourcing and Regions of Sub-optimality of Message Passing Algorithms

no code implementations 13 Mar 2018 Christian Schmidt, Lenka Zdeborová

We further study numerically the performance of approximate message passing, derived in the dense limit, on sparse instances and carry out experiments on a real world dataset.

Glassy nature of the hard phase in inference problems

no code implementations 15 May 2018 Fabrizio Antenucci, Silvio Franz, Pierfrancesco Urbani, Lenka Zdeborová

An algorithmically hard phase was described in a range of inference problems: even if the signal can be reconstructed with a small error from an information theoretic point of view, known algorithms fail unless the noise-to-signal ratio is sufficiently small.

The committee machine: Computational to statistical gaps in learning a two-layers neural network

1 code implementation NeurIPS 2018 Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks.

Approximate Survey Propagation for Statistical Inference

no code implementations 3 Jul 2018 Fabrizio Antenucci, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

The approximate message passing algorithm has enjoyed considerable attention in the last decade.

Approximate message-passing for convex optimization with non-separable penalties

2 code implementations 17 Sep 2018 Andre Manoel, Florent Krzakala, Gaël Varoquaux, Bertrand Thirion, Lenka Zdeborová

We introduce an iterative optimization scheme for convex objectives consisting of a linear loss and a non-separable penalty, based on the expectation-consistent approximation and the vector approximate message-passing (VAMP) algorithm.

Rank-one matrix estimation: analysis of algorithmic and information theoretic limits by the spatial coupling method

no code implementations 6 Dec 2018 Jean Barbier, Mohamad Dia, Nicolas Macris, Florent Krzakala, Lenka Zdeborová

We characterize the detectability phase transitions in a large set of estimation problems, where we show that there exists a gap between what currently known polynomial algorithms (in particular spectral methods and approximate message-passing) can do and what is expected information theoretically.

Community Detection Compressive Sensing

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

no code implementations 21 Dec 2018 Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.

Generalisation dynamics of online learning in over-parameterised neural networks

no code implementations 25 Jan 2019 Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

Deep neural networks achieve stellar generalisation on a variety of problems, despite often being large enough to easily fit all their training data.

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

no code implementations 1 Feb 2019 Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model.

Machine learning and the physical sciences

1 code implementation 25 Mar 2019 Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Laurent Daudet, Maria Schuld, Naftali Tishby, Leslie Vogt-Maranto, Lenka Zdeborová

Machine learning encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, which has entered most scientific disciplines in recent years.

Computational Physics Cosmology and Nongalactic Astrophysics Disordered Systems and Neural Networks High Energy Physics - Theory Quantum Physics

The spiked matrix model with generative priors

2 code implementations NeurIPS 2019 Benjamin Aubin, Bruno Loureiro, Antoine Maillard, Florent Krzakala, Lenka Zdeborová

Here, we replace the sparsity assumption by generative modelling, and investigate the consequences on statistical and algorithmic properties.

Dimensionality Reduction

On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix

1 code implementation 11 Jun 2019 Alia Abbara, Antoine Baker, Florent Krzakala, Lenka Zdeborová

In a noiseless linear estimation problem, one aims to reconstruct a vector x* from the knowledge of its linear projections y=Phi x*.

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

no code implementations 18 Jul 2019 Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones.

Multilayer Modularity Belief Propagation To Assess Detectability Of Community Structure

1 code implementation 13 Aug 2019 William H. Weir, Benjamin Walker, Lenka Zdeborová, Peter J. Mucha

We compare our approach with a widely used community detection tool, GenLouvain, across a range of synthetic, multilayer benchmark networks, demonstrating that our method performs comparably to the state of the art.

Social and Information Networks Data Analysis, Statistics and Probability Physics and Society

Modelling the influence of data structure on learning in neural networks: the hidden manifold model

1 code implementation 25 Sep 2019 Sebastian Goldt, Marc Mézard, Florent Krzakala, Lenka Zdeborová

We demonstrate that learning of the hidden manifold model is amenable to an analytical treatment by proving a "Gaussian Equivalence Property" (GEP), and we use the GEP to show how the dynamics of two-layer neural networks trained using one-pass stochastic gradient descent is captured by a set of integro-differential equations that track the performance of the network at all times.

Generative Adversarial Network

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

1 code implementation NeurIPS 2019 Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.

Exact asymptotics for phase retrieval and compressed sensing with random generative priors

no code implementations 4 Dec 2019 Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

We consider the problem of compressed sensing and of (real-valued) phase retrieval with random measurement matrix.

Retrieval

Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning

no code implementations 5 Dec 2019 Alia Abbara, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová

Statistical learning theory provides bounds of the generalization gap, using in particular the Vapnik-Chervonenkis dimension and the Rademacher complexity.

Learning Theory

Large deviations for the perceptron model and consequences for active learning

no code implementations 9 Dec 2019 Hugo Cui, Luca Saglietti, Lenka Zdeborová

These large deviations then provide optimal achievable performance boundaries for any active learning algorithm.

Active Learning

Generalisation error in learning with random features and the hidden manifold model

no code implementations ICML 2020 Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová

In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal against random Gaussian projections in learning with random features, and discuss the role played by correlations in the data generated by the hidden manifold model.

regression valid

Tree-AMP: Compositional Inference with Tree Approximate Message Passing

1 code implementation 3 Apr 2020 Antoine Baker, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová

We introduce Tree-AMP, standing for Tree Approximate Message Passing, a python package for compositional inference in high-dimensional tree-structured models.

Phase retrieval in high dimensions: Statistical and computational phase transitions

1 code implementation NeurIPS 2020 Antoine Maillard, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We consider the phase retrieval problem of reconstructing a $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_\mu = | \sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbf{\Phi}$, in a high-dimensional setting where $m, n\to\infty$ while $\alpha = m/n=\Theta(1)$.

Retrieval Vocal Bursts Intensity Prediction
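
The observation model quoted in this entry, $Y_\mu = |\sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, can be simulated in a few lines. The sketch below assumes the simplest noiseless real-valued case with an i.i.d. standard Gaussian sensing matrix, which is only one member of the class of matrices the abstract mentions; the function name and the values of `n` and `alpha` are illustrative choices.

```python
import numpy as np


def phase_retrieval_observations(phi: np.ndarray, x_star: np.ndarray) -> np.ndarray:
    """Noiseless observations Y_mu = |sum_i Phi_{mu i} X*_i| / sqrt(n)."""
    n = x_star.shape[0]
    return np.abs(phi @ x_star) / np.sqrt(n)


rng = np.random.default_rng(0)
n, alpha = 200, 2.0          # signal dimension and sampling ratio alpha = m / n
m = int(alpha * n)           # number of observations

x_star = rng.standard_normal(n)      # assumed Gaussian signal
phi = rng.standard_normal((m, n))    # assumed i.i.d. Gaussian sensing matrix
y = phase_retrieval_observations(phi, x_star)
```

Because only the modulus is observed, `x_star` and `-x_star` produce identical observations, so in the real case the signal can be recovered at best up to a global sign.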

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

no code implementations NeurIPS 2020 Benjamin Aubin, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random iid inputs.

regression

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

no code implementations NeurIPS 2020 Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem.

Retrieval

The Gaussian equivalence of generative models for learning with shallow neural networks

1 code implementation 25 Jun 2020 Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, Marc Mézard, Lenka Zdeborová

Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models.

BIG-bench Machine Learning

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

no code implementations NeurIPS 2020 Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

We consider a teacher-student scenario where the teacher has the same structure as the student with a hidden layer of smaller width $m^*\le m$.

Solvable Model for Inheriting the Regularization through Knowledge Distillation

no code implementations 1 Dec 2020 Luca Saglietti, Lenka Zdeborová

In recent years the empirical success of transfer learning with neural networks has stimulated an increasing interest in obtaining a theoretical understanding of its core properties.

Knowledge Distillation Transfer Learning

Construction of optimal spectral methods in phase retrieval

1 code implementation 8 Dec 2020 Antoine Maillard, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

We consider the phase retrieval problem, in which the observer wishes to recover a $n$-dimensional real or complex signal $\mathbf{X}^\star$ from the (possibly noisy) observation of $|\mathbf{\Phi} \mathbf{X}^\star|$, in which $\mathbf{\Phi}$ is a matrix of size $m \times n$.

Information Theory Disordered Systems and Neural Networks Information Theory

Learning curves of generic features maps for realistic datasets with a teacher-student model

1 code implementation NeurIPS 2021 Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

While still solvable in a closed form, this generalization is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework.

Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

1 code implementation 23 Feb 2021 Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborová

Here, we show theoretically that two-layer neural networks (2LNN) with only a few hidden neurons can beat the performance of kernel learning on a simple Gaussian mixture classification task.

Image Classification

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

no code implementations 8 Mar 2021 Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová

In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss-landscapes and which of them is able to reach the best generalization error at limited sample complexity.

Navigate Retrieval

Bayesian reconstruction of memories stored in neural networks from their connectivity

1 code implementation 16 May 2021 Sebastian Goldt, Florent Krzakala, Lenka Zdeborová, Nicolas Brunel

The advent of comprehensive synaptic wiring diagrams of large neural circuits has created the field of connectomics and given rise to a number of open research questions.

Bayesian Inference

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

no code implementations NeurIPS 2021 Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this work, we unify and extend this line of work, providing a characterization of all regimes and excess error decay rates that can be observed in terms of the interplay of noise and regularization.

regression

Probing transfer learning with a model of synthetic correlated datasets

no code implementations 9 Jun 2021 Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task.

Binary Classification Transfer Learning

Error Scaling Laws for Kernel Classification under Source and Capacity Conditions

no code implementations 29 Jan 2022 Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We find that our rates tightly describe the learning curves for this class of data sets, and are also observed on real data.

Classification

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

2 code implementations 1 Feb 2022 Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent.

Theoretical characterization of uncertainty in high-dimensional linear classification

1 code implementation 7 Feb 2022 Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this manuscript, we characterise uncertainty for learning from a limited number of samples of high-dimensional Gaussian input data and labels generated by the probit model.

Classification Vocal Bursts Intensity Prediction

Learning curves for the multi-class teacher-student perceptron

1 code implementation 22 Mar 2022 Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality.

Binary Classification Learning Theory +1

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

1 code implementation 26 May 2022 Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors.

Clustering Vocal Bursts Intensity Prediction

Gaussian Universality of Perceptrons with Random Labels

2 code implementations 26 May 2022 Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance.

The planted XY model: thermodynamics and inference

no code implementations 12 Aug 2022 Siyu Chen, Guanhao Huang, Giovanni Piccioli, Lenka Zdeborová

We derive the replica symmetric (RS) phase diagram in the temperature, ferromagnetic bias plane using the approximate message passing (AMP) algorithm and its state evolution (SE).

Bayes-optimal Learning of Deep Random Networks of Extensive-width

no code implementations 1 Feb 2023 Hugo Cui, Florent Krzakala, Lenka Zdeborová

We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights.

regression

Expectation consistency for calibration of neural networks

2 code implementations 5 Mar 2023 Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite their incredible performance, it is well reported that deep neural networks tend to be overoptimistic about their prediction confidence.

Uncertainty Quantification

Gibbs Sampling the Posterior of Neural Networks

2 code implementations 5 Jun 2023 Giovanni Piccioli, Emanuele Troiani, Lenka Zdeborová

In this paper, we study sampling from a posterior derived from a neural network.

Sampling with flows, diffusion and autoregressive neural networks: A spin-glass perspective

1 code implementation 27 Aug 2023 Davide Ghio, Yatin Dandi, Florent Krzakala, Lenka Zdeborová

Recent years witnessed the development of powerful generative models based on flows, diffusion or autoregressive neural networks, achieving remarkable success in generating data from examples with applications in a broad range of areas.

Denoising

Analysis of learning a flow-based generative model from limited sample complexity

1 code implementation 5 Oct 2023 Hugo Cui, Florent Krzakala, Eric Vanden-Eijnden, Lenka Zdeborová

We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture.

Denoising

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents

no code implementations 5 Feb 2024 Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala

In particular, multi-pass GD with finite stepsize is found to overcome the limitations of gradient flow and single-pass GD given by the information exponent (Ben Arous et al., 2021) and leap exponent (Abbe et al., 2023) of the target function.

A phase transition between positional and semantic learning in a solvable model of dot-product attention

no code implementations 6 Feb 2024 Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová

We investigate how a dot-product attention layer learns a positional attention matrix (with tokens attending to each other based on their respective positions) and a semantic attention matrix (with tokens attending to each other based on their meaning).

Asymptotics of feature learning in two-layer networks after one gradient-step

1 code implementation 7 Feb 2024 Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

To our knowledge, our results provide the first tight description of the impact of feature learning in the generalization of two-layer neural networks in the large learning rate regime $\eta=\Theta_{d}(d)$, beyond perturbative finite width corrections of the conjugate and neural tangent kernels.

Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression

no code implementations 21 Feb 2024 Lucas Clarté, Adrien Vandenbroucque, Guillaume Dalle, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, bootstrap and the jackknife, and their performance in high-dimensional supervised regression tasks.

regression
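
For readers unfamiliar with the resampling methods named in this entry, the following is a minimal sketch of a pairs bootstrap applied to ridge regression. It is a generic textbook construction, not the authors' analysis: the synthetic data model, the function names (`ridge_fit`, `bootstrap_prediction_std`), and all parameter values are assumptions made for illustration.

```python
import numpy as np


def ridge_fit(x: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    """Closed-form ridge estimator: (X^T X + lam I)^{-1} X^T y."""
    d = x.shape[1]
    return np.linalg.solve(x.T @ x + lam * np.eye(d), x.T @ y)


def bootstrap_prediction_std(x, y, x_new, lam, n_boot, rng):
    """Pairs bootstrap: refit on resampled rows, return the spread of predictions."""
    n = x.shape[0]
    preds = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample n rows with replacement
        preds[b] = x_new @ ridge_fit(x[idx], y[idx], lam)
    return preds.std(ddof=1)


rng = np.random.default_rng(0)
n, d = 200, 50                                   # samples and dimension (illustrative)
w_true = rng.standard_normal(d) / np.sqrt(d)     # hypothetical ground-truth weights
x = rng.standard_normal((n, d))
y = x @ w_true + 0.5 * rng.standard_normal(n)    # noisy linear labels
x_new = rng.standard_normal(d)                   # a single test input

spread = bootstrap_prediction_std(x, y, x_new, lam=1.0, n_boot=200, rng=rng)
```

Here `spread` is the bootstrap estimate of the standard deviation of the prediction at `x_new`. Subsampling would differ only in drawing fewer than `n` rows without replacement.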

Fundamental limits of Non-Linear Low-Rank Matrix Estimation

no code implementations 7 Mar 2024 Pierre Mergny, Justin Ko, Florent Krzakala, Lenka Zdeborová

We consider the task of estimating a low-rank matrix from non-linear and noisy observations.

Denoising
