Search Results for author: Lenka Zdeborová

Found 67 papers, 29 papers with code

Gaussian Universality of Linear Classifiers with Random Labels in High-Dimension

1 code implementation • 26 May 2022 • Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

Our main contribution is a rigorous proof that data coming from a range of generative models in high dimensions have the same minimum training loss as Gaussian data with corresponding data covariance.

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

1 code implementation • 26 May 2022 • Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors.

Learning curves for the multi-class teacher-student perceptron

1 code implementation • 22 Mar 2022 • Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality.

Learning Theory Multi-class Classification

Theoretical characterization of uncertainty in high-dimensional linear classification

1 code implementation • 7 Feb 2022 • Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this manuscript, we characterise uncertainty for learning from a limited number of samples of high-dimensional Gaussian input data and labels generated by the probit model.

Classification

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

1 code implementation • 1 Feb 2022 • Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent.

Error Rates for Kernel Classification under Source and Capacity Conditions

no code implementations • 29 Jan 2022 • Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We derive the rates as a function of the source and capacity coefficients for two standard kernel classification settings, namely margin-maximizing Support Vector Machines (SVM) and ridge classification, and contrast the two methods.

Classification

Probing transfer learning with a model of synthetic correlated datasets

no code implementations • 9 Jun 2021 • Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task.

Transfer Learning

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

no code implementations • NeurIPS 2021 • Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this work, we unify and extend this line of work, providing a characterization of all regimes and excess error decay rates that can be observed in terms of the interplay of noise and regularization.

Bayesian reconstruction of memories stored in neural networks from their connectivity

1 code implementation • 16 May 2021 • Sebastian Goldt, Florent Krzakala, Lenka Zdeborová, Nicolas Brunel

The advent of comprehensive synaptic wiring diagrams of large neural circuits has created the field of connectomics and given rise to a number of open research questions.

Bayesian Inference

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

no code implementations • 8 Mar 2021 • Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová

In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss-landscapes and which of them is able to reach the best generalization error at limited sample complexity.

Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed

1 code implementation • 23 Feb 2021 • Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborová

Here, we show theoretically that two-layer neural networks (2LNN) with only a few hidden neurons can beat the performance of kernel learning on a simple Gaussian mixture classification task.

Image Classification

Learning curves of generic features maps for realistic datasets with a teacher-student model

1 code implementation • NeurIPS 2021 • Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

While still solvable in a closed form, this generalization is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework.

Construction of optimal spectral methods in phase retrieval

1 code implementation • 8 Dec 2020 • Antoine Maillard, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

We consider the phase retrieval problem, in which the observer wishes to recover an $n$-dimensional real or complex signal $\mathbf{X}^\star$ from the (possibly noisy) observation of $|\mathbf{\Phi} \mathbf{X}^\star|$, in which $\mathbf{\Phi}$ is a matrix of size $m \times n$.

Information Theory Disordered Systems and Neural Networks

Solvable Model for Inheriting the Regularization through Knowledge Distillation

no code implementations • 1 Dec 2020 • Luca Saglietti, Lenka Zdeborová

In recent years the empirical success of transfer learning with neural networks has stimulated an increasing interest in obtaining a theoretical understanding of its core properties.

Knowledge Distillation Transfer Learning

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

We consider a teacher-student scenario where the teacher has the same structure as the student with a hidden layer of smaller width $m^*\le m$.

The Gaussian equivalence of generative models for learning with shallow neural networks

1 code implementation • 25 Jun 2020 • Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, Marc Mézard, Lenka Zdeborová

Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models.

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem.

Generalization error in high-dimensional perceptrons: Approaching Bayes error with convex optimization

no code implementations • NeurIPS 2020 • Benjamin Aubin, Florent Krzakala, Yue M. Lu, Lenka Zdeborová

We consider a commonly studied supervised classification of a synthetic dataset whose labels are generated by feeding a one-layer neural network with random iid inputs.

Phase retrieval in high dimensions: Statistical and computational phase transitions

1 code implementation • NeurIPS 2020 • Antoine Maillard, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We consider the phase retrieval problem of reconstructing an $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_\mu = | \sum_{i=1}^n \Phi_{\mu i} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbf{\Phi}$, in a high-dimensional setting where $m, n\to\infty$ while $\alpha = m/n=\Theta(1)$.

Tree-AMP: Compositional Inference with Tree Approximate Message Passing

1 code implementation • 3 Apr 2020 • Antoine Baker, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová

We introduce Tree-AMP, standing for Tree Approximate Message Passing, a python package for compositional inference in high-dimensional tree-structured models.

Generalisation error in learning with random features and the hidden manifold model

no code implementations • ICML 2020 • Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová

In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression with a peak at the interpolation threshold, we illustrate the superiority of orthogonal against random Gaussian projections in learning with random features, and discuss the role played by correlations in the data generated by the hidden manifold model.

Large deviations for the perceptron model and consequences for active learning

no code implementations • 9 Dec 2019 • Hugo Cui, Luca Saglietti, Lenka Zdeborová

These large deviations then provide optimal achievable performance boundaries for any active learning algorithm.

Active Learning

Rademacher complexity and spin glasses: A link between the replica and statistical theories of learning

no code implementations • 5 Dec 2019 • Alia Abbara, Benjamin Aubin, Florent Krzakala, Lenka Zdeborová

Statistical learning theory provides bounds on the generalization gap, using in particular the Vapnik-Chervonenkis dimension and the Rademacher complexity.

Learning Theory

Exact asymptotics for phase retrieval and compressed sensing with random generative priors

no code implementations • 4 Dec 2019 • Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

We consider the problem of compressed sensing and of (real-valued) phase retrieval with a random measurement matrix.

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

1 code implementation • NeurIPS 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.

Modelling the influence of data structure on learning in neural networks: the hidden manifold model

1 code implementation • 25 Sep 2019 • Sebastian Goldt, Marc Mézard, Florent Krzakala, Lenka Zdeborová

We demonstrate that learning of the hidden manifold model is amenable to an analytical treatment by proving a "Gaussian Equivalence Property" (GEP), and we use the GEP to show how the dynamics of two-layer neural networks trained using one-pass stochastic gradient descent is captured by a set of integro-differential equations that track the performance of the network at all times.

Multilayer Modularity Belief Propagation To Assess Detectability Of Community Structure

1 code implementation • 13 Aug 2019 • William H. Weir, Benjamin Walker, Lenka Zdeborová, Peter J. Mucha

We compare our approach with a widely used community detection tool, GenLouvain, across a range of synthetic, multilayer benchmark networks, demonstrating that our method performs comparably to the state of the art.

Social and Information Networks Data Analysis, Statistics and Probability Physics and Society

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

no code implementations • 18 Jul 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones.

On the Universality of Noiseless Linear Estimation with Respect to the Measurement Matrix

1 code implementation • 11 Jun 2019 • Alia Abbara, Antoine Baker, Florent Krzakala, Lenka Zdeborová

In a noiseless linear estimation problem, one aims to reconstruct a vector $x^*$ from the knowledge of its linear projections $y = \Phi x^*$.

The spiked matrix model with generative priors

2 code implementations • NeurIPS 2019 • Benjamin Aubin, Bruno Loureiro, Antoine Maillard, Florent Krzakala, Lenka Zdeborová

Here, we replace the sparsity assumption by generative modelling, and investigate the consequences on statistical and algorithmic properties.

Dimensionality Reduction

Machine learning and the physical sciences

1 code implementation • 25 Mar 2019 • Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Laurent Daudet, Maria Schuld, Naftali Tishby, Leslie Vogt-Maranto, Lenka Zdeborová

Machine learning encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, which has entered most scientific disciplines in recent years.

Computational Physics Cosmology and Nongalactic Astrophysics Disordered Systems and Neural Networks High Energy Physics - Theory Quantum Physics

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

no code implementations • 1 Feb 2019 • Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model.

Generalisation dynamics of online learning in over-parameterised neural networks

no code implementations • 25 Jan 2019 • Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová

Deep neural networks achieve stellar generalisation on a variety of problems, despite often being large enough to easily fit all their training data.

Online Learning

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

no code implementations • 21 Dec 2018 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.

Rank-one matrix estimation: analysis of algorithmic and information theoretic limits by the spatial coupling method

no code implementations • 6 Dec 2018 • Jean Barbier, Mohamad Dia, Nicolas Macris, Florent Krzakala, Lenka Zdeborová

We characterize the detectability phase transitions in a large set of estimation problems, where we show that there exists a gap between what currently known polynomial algorithms (in particular spectral methods and approximate message-passing) can do and what is expected information theoretically.

Community Detection Compressive Sensing

Approximate message-passing for convex optimization with non-separable penalties

no code implementations • 17 Sep 2018 • Andre Manoel, Florent Krzakala, Gaël Varoquaux, Bertrand Thirion, Lenka Zdeborová

We introduce an iterative optimization scheme for convex objectives consisting of a linear loss and a non-separable penalty, based on the expectation-consistent approximation and the vector approximate message-passing (VAMP) algorithm.

Approximate Survey Propagation for Statistical Inference

no code implementations • 3 Jul 2018 • Fabrizio Antenucci, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

The approximate message passing algorithm has enjoyed considerable attention in the last decade.

The committee machine: Computational to statistical gaps in learning a two-layers neural network

1 code implementation • NeurIPS 2018 • Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks.

Glassy nature of the hard phase in inference problems

no code implementations • 15 May 2018 • Fabrizio Antenucci, Silvio Franz, Pierfrancesco Urbani, Lenka Zdeborová

An algorithmically hard phase was described in a range of inference problems: even if the signal can be reconstructed with a small error from an information theoretic point of view, known algorithms fail unless the noise-to-signal ratio is sufficiently small.

Dense Limit of the Dawid-Skene Model for Crowdsourcing and Regions of Sub-optimality of Message Passing Algorithms

no code implementations • 13 Mar 2018 • Christian Schmidt, Lenka Zdeborová

We further study numerically the performance of approximate message passing, derived in the dense limit, on sparse instances and carry out experiments on a real world dataset.

Optimal Errors and Phase Transitions in High-Dimensional Generalized Linear Models

1 code implementation • 10 Aug 2017 • Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Non-rigorous predictions for the optimal errors existed for special cases of GLMs, e.g. for the perceptron, in the field of statistical physics based on the so-called replica method.

Streaming Bayesian inference: theoretical limits and mini-batch approximate message-passing

no code implementations • 2 Jun 2017 • Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

In statistical learning for real-world large-scale data problems, one must often resort to "streaming" algorithms which operate sequentially on small batches of data.

Bayesian Inference

Multi-Layer Generalized Linear Estimation

no code implementations • 24 Jan 2017 • Andre Manoel, Florent Krzakala, Marc Mézard, Lenka Zdeborová

We consider the problem of reconstructing a signal from multi-layered (possibly) non-linear measurements.

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

no code implementations • 10 Oct 2016 • Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, Cris Moore, Lenka Zdeborová

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n, m \rightarrow \infty$ and $\alpha = m/n$ stays finite.

Fast Randomized Semi-Supervised Clustering

no code implementations • 20 May 2016 • Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items.

General Classification

Clustering from Sparse Pairwise Measurements

no code implementations • 25 Jan 2016 • Alaa Saade, Marc Lelarge, Florent Krzakala, Lenka Zdeborová

We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items.

Statistical physics of inference: Thresholds and algorithms

1 code implementation • 8 Nov 2015 • Lenka Zdeborová, Florent Krzakala

Many questions of fundamental interest in today's science can be formulated as inference problems: some partial, or noisy, observations are performed over a set of variables and the goal is to recover, or infer, the values of the variables based on the indirect information contained in the measurements.

Community Detection

MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel

1 code implementation • 14 Jul 2015 • Thibault Lesieur, Florent Krzakala, Lenka Zdeborová

This paper considers probabilistic estimation of a low-rank matrix from non-linear element-wise measurements of its elements.

Stochastic Block Model

Spectral Detection in the Censored Block Model

no code implementations • 31 Jan 2015 • Alaa Saade, Florent Krzakala, Marc Lelarge, Lenka Zdeborová

We describe two spectral algorithms for this task based on the non-backtracking and the Bethe Hessian operators.

Community Detection

Sparse Estimation with the Swept Approximated Message-Passing Algorithm

1 code implementation • 17 Jun 2014 • Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

Approximate Message Passing (AMP) has been shown to be a superior method for inference problems, such as the recovery of signals from sets of noisy, lower-dimensionality measurements, both in terms of reconstruction accuracy and in computational efficiency.

Spectral Clustering of Graphs with the Bethe Hessian

2 code implementations • NeurIPS 2014 • Alaa Saade, Florent Krzakala, Lenka Zdeborová

We show that this approach combines the performances of the non-backtracking operator, thus detecting clusters all the way down to the theoretical limit in the stochastic block model, with the computational, theoretical and memory advantages of real symmetric matrices.

Stochastic Block Model

Phase transitions in semisupervised clustering of sparse networks

no code implementations • 30 Apr 2014 • Pan Zhang, Cristopher Moore, Lenka Zdeborová

For larger $k$ where a hard but detectable regime exists, we find that the easy/hard transition (the point at which efficient algorithms can do better than chance) becomes a line of transitions where the accuracy jumps discontinuously at a critical value of $\alpha$.

Stochastic Block Model

Phase transitions and sample complexity in Bayes-optimal matrix factorization

no code implementations • 6 Feb 2014 • Yoshiyuki Kabashima, Florent Krzakala, Marc Mézard, Ayaka Sakata, Lenka Zdeborová

We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm.

Dictionary Learning Low-Rank Matrix Completion

Blind Calibration in Compressed Sensing using Message Passing Algorithms

no code implementations • NeurIPS 2013 • Christophe Schulke, Francesco Caltagirone, Florent Krzakala, Lenka Zdeborová

We study numerically the phase diagram of the blind calibration problem, and show that even in cases where convex relaxation is possible, our algorithm requires a smaller number of measurements and/or signals in order to perform well.

Probabilistic Reconstruction in Compressed Sensing: Algorithms, Phase Diagrams, and Threshold Achieving Matrices

1 code implementation • 18 Jun 2012 • Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

We further develop the asymptotic analysis of the corresponding phase diagrams with and without measurement noise, for different distributions of signals, and discuss the best possible reconstruction performances regardless of the algorithm.

Statistical Mechanics Information Theory

Statistical physics-based reconstruction in compressed sensing

1 code implementation • 20 Sep 2011 • Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

Compressed sensing is triggering a major evolution in signal acquisition.

Statistical Mechanics Information Theory

Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications

no code implementations • 14 Sep 2011 • Aurelien Decelle, Florent Krzakala, Cristopher Moore, Lenka Zdeborová

In this paper we extend our previous work on the stochastic block model, a commonly used generative model for social and biological networks, and the problem of inferring functional groups or communities from the topology of the network.

Statistical Mechanics Disordered Systems and Neural Networks Social and Information Networks Physics and Society

Phase transition in the detection of modules in sparse networks

no code implementations • 6 Feb 2011 • Aurelien Decelle, Florent Krzakala, Cristopher Moore, Lenka Zdeborová

We present an asymptotically exact analysis of the problem of detecting communities in sparse random networks.
