Search Results for author: Ludovic Stephan

Found 8 papers, 8 papers with code

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time

1 code implementation • 29 May 2023 • Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

The picture drastically improves over multiple gradient steps: we show that a batch-size of $n = \mathcal{O}(d)$ is indeed enough to learn multiple target directions satisfying a staircase property, where more and more directions can be learned over time.
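
A minimal numpy sketch of this setting (not the authors' code; the learning-rate scaling, batch ratio and activation below are illustrative assumptions): one large-batch gradient step on the first layer of a two-layer network, measuring how much a hidden unit aligns with a single target direction.

```python
# Sketch: one "giant" gradient step on the first layer of a two-layer network,
# with a batch of n = O(d) samples. Scalings (learning rate, batch ratio) are
# illustrative assumptions, not taken from the paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, p = 200, 32                    # input dimension, hidden width
n = 8 * d                         # batch size proportional to d
eta = np.sqrt(p) * d              # "giant" learning-rate scaling (assumed)

w_star = rng.standard_normal(d)                  # single target direction
X = rng.standard_normal((n, d))
y = np.tanh(X @ w_star / np.sqrt(d))             # single-index target

W = rng.standard_normal((p, d))                  # first-layer weights
a = rng.standard_normal(p) / np.sqrt(p)          # fixed second layer

pre = X @ W.T / np.sqrt(d)                       # (n, p) pre-activations
err = np.tanh(pre) @ a - y                       # squared-loss residuals
grad_W = ((err[:, None] * (1 - np.tanh(pre) ** 2) * a).T @ X) / (n * np.sqrt(d))
W_new = W - eta * grad_W

def max_overlap(W):
    """Largest normalized alignment between a hidden unit and the target."""
    cos = (W @ w_star) / (np.linalg.norm(W, axis=1) * np.linalg.norm(w_star))
    return np.max(np.abs(cos))

print(f"overlap before: {max_overlap(W):.3f}  after one giant step: {max_overlap(W_new):.3f}")
```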

Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD

1 code implementation • 29 May 2023 • Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

These insights are grounded in the reduction of SGD dynamics to a stochastic process in lower dimensions, where escaping mediocrity equates to calculating an exit time.
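
A toy illustration of that picture (the drift, noise scale and escape threshold are assumptions, not the paper's reduced process): simulating the first-exit time of a one-dimensional stochastic process started near an unstable fixed point.

```python
# Sketch: first-exit time of a one-dimensional stochastic process started near
# an unstable fixed point, mimicking the low-dimensional reduction of SGD.
# Drift, noise scale and escape threshold are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
d = 1000                # proxy for the input dimension: noise scales as 1/sqrt(d)
dt = 1e-2               # Euler-Maruyama time step
threshold = 0.5         # declare "mediocrity escaped" once the overlap exceeds this

def exit_time(max_steps=200_000):
    m = 1.0 / np.sqrt(d)                     # typical overlap at initialization
    for step in range(max_steps):
        drift = m * (1.0 - m ** 2)           # toy logistic-type drift
        noise = rng.standard_normal() / np.sqrt(d)
        m += dt * drift + np.sqrt(dt) * noise
        if abs(m) >= threshold:
            return step * dt
    return np.inf

times = [exit_time() for _ in range(20)]
print(f"mean exit time over 20 runs: {np.mean(times):.1f}")
```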

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

1 code implementation • 17 Feb 2023 • Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we ask ourselves the question: "when is a single Gaussian enough to characterize the error?".
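
A quick numerical check in the spirit of that question (a simplified illustration, not the paper's experiments): ridge-regression test error on Gaussian covariates versus Rademacher covariates with the same first two moments.

```python
# Sketch: ridge-regression test error on Gaussian covariates versus Rademacher
# covariates with matching mean and covariance. A simplified check of the
# universality question, not the paper's experimental pipeline.
import numpy as np

rng = np.random.default_rng(2)
d, n, lam = 300, 600, 0.1
theta = rng.standard_normal(d) / np.sqrt(d)       # ground-truth parameter

def test_error(sample):
    X = sample((n, d))
    y = X @ theta + 0.1 * rng.standard_normal(n)
    w = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    X_test = sample((10_000, d))
    y_test = X_test @ theta + 0.1 * rng.standard_normal(10_000)
    return np.mean((X_test @ w - y_test) ** 2)

err_gauss = test_error(rng.standard_normal)
err_rad = test_error(lambda shape: rng.choice([-1.0, 1.0], size=shape))
print(f"test error, Gaussian: {err_gauss:.4f}   Rademacher: {err_rad:.4f}")
```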

From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks

1 code implementation • 12 Feb 2023 • Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro

This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar, though not necessarily identical, target function.
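
A bare-bones simulation of those microscopic dynamics (the paper's contribution is the reduction to dimensionless ODEs; the widths, learning rate and activation below are assumptions): one-pass SGD in a teacher-student setup, tracking the student-teacher overlaps that such ODEs describe.

```python
# Sketch: one-pass (online) SGD for a two-layer student trained on Gaussian
# data with labels from a two-layer teacher, each sample used once. This only
# simulates the microscopic process; widths, learning rate and activation are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
d, k, p = 100, 2, 4            # input dim, teacher width, student width
eta, steps = 0.5, 50_000

Wt = rng.standard_normal((k, d)) / np.sqrt(d)      # teacher first layer
Ws = rng.standard_normal((p, d)) / np.sqrt(d)      # student first layer
a = np.ones(p) / p                                  # fixed student second layer

for _ in range(steps):
    x = rng.standard_normal(d)                      # fresh sample at every step
    y = np.mean(np.tanh(Wt @ x))                    # teacher label
    pre = Ws @ x
    err = a @ np.tanh(pre) - y
    Ws -= (eta / d) * np.outer(err * a * (1 - np.tanh(pre) ** 2), x)

# order parameters: normalized overlaps between student and teacher units
M = (Ws @ Wt.T) / np.outer(np.linalg.norm(Ws, axis=1), np.linalg.norm(Wt, axis=1))
print("student-teacher overlaps:\n", np.round(M, 2))
```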

Gaussian Universality of Perceptrons with Random Labels

2 code implementations • 26 May 2022 • Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance.
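
A toy check of that claim (optimizer, regularization and sample sizes are assumptions, not the paper's setup): the minimum logistic training loss of a perceptron with random labels, on Gaussian inputs versus uniform inputs rescaled to the same covariance.

```python
# Sketch: minimum (lightly regularized) logistic training loss of a perceptron
# with random labels, on Gaussian inputs versus uniform inputs rescaled to zero
# mean and unit variance. A toy check of the universality claim; the optimizer
# and sizes are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
d, n, lam = 100, 150, 1e-3

def min_train_loss(X):
    y = rng.choice([-1.0, 1.0], size=n)                   # random labels
    def loss_and_grad(w):
        z = y * (X @ w) / np.sqrt(d)
        loss = np.mean(np.logaddexp(0.0, -z)) + 0.5 * lam * (w @ w) / d
        s = -y * 0.5 * (1.0 - np.tanh(z / 2.0))           # d(loss)/dz per sample
        grad = X.T @ s / (n * np.sqrt(d)) + lam * w / d
        return loss, grad
    res = minimize(loss_and_grad, np.zeros(d), jac=True, method="L-BFGS-B")
    return res.fun

X_gauss = rng.standard_normal((n, d))
X_unif = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, d))
print(f"min training loss, Gaussian: {min_train_loss(X_gauss):.4f}")
print(f"min training loss, uniform:  {min_train_loss(X_unif):.4f}")
```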

Sparse random hypergraphs: Non-backtracking spectra and community detection

1 code implementation • 14 Mar 2022 • Ludovic Stephan, Yizhe Zhu

We consider the community detection problem in a sparse $q$-uniform hypergraph $G$, assuming that $G$ is generated according to the Hypergraph Stochastic Block Model (HSBM).
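
A small illustration of the model (the paper analyses the non-backtracking operator; the simpler clique-expansion adjacency spectrum used below is only a proxy, and all parameter values are assumptions): generating a sparse 3-uniform HSBM with two communities and clustering it spectrally.

```python
# Sketch: a sparse 3-uniform Hypergraph Stochastic Block Model with two equal
# communities, clustered via the spectrum of its clique-expansion adjacency.
# The paper works with the non-backtracking operator; this spectral proxy and
# all parameters here are illustrative assumptions.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, q = 120, 3                                      # vertices, hyperedge size
labels = np.repeat([0, 1], n // 2)
p_in, p_out = 40 / n ** (q - 1), 4 / n ** (q - 1)  # O(n) hyperedges in total

A = np.zeros((n, n))
for edge in combinations(range(n), q):
    monochromatic = len(set(labels[list(edge)])) == 1
    if rng.random() < (p_in if monochromatic else p_out):
        for i, j in combinations(edge, 2):         # clique expansion of the hyperedge
            A[i, j] += 1.0
            A[j, i] += 1.0

# the second-largest adjacency eigenvector carries the community information
_, vecs = np.linalg.eigh(A)
guess = (vecs[:, -2] > 0).astype(int)
accuracy = max(np.mean(guess == labels), np.mean(guess != labels))
print(f"community recovery accuracy: {accuracy:.2f}")
```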

Community Detection • Dimensionality Reduction • +1

Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

2 code implementations • 1 Feb 2022 • Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent.
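
A minimal empirical demonstration of that phenomenon (width, learning rate and data below are illustrative assumptions, not the paper's phase-diagram setup): full-batch gradient descent on an over-parametrized two-layer network driving the training loss on random data to near zero.

```python
# Sketch: full-batch gradient descent on an over-parametrized two-layer network
# (hidden width much larger than the number of samples), driving the training
# loss on random data to near zero. Width, learning rate and data are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
d, n, p = 30, 20, 1000                  # p >> n: over-parametrized regime
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

W = rng.standard_normal((p, d)) / np.sqrt(d)        # trained first layer
a = rng.choice([-1.0, 1.0], size=p) / np.sqrt(p)    # fixed second layer
eta = 0.5

for step in range(2001):
    pre = X @ W.T                                   # (n, p) pre-activations
    err = np.tanh(pre) @ a - y
    if step % 500 == 0:
        print(f"step {step:4d}   train loss {np.mean(err ** 2) / 2:.2e}")
    grad_W = ((err[:, None] * (1 - np.tanh(pre) ** 2) * a).T @ X) / n
    W -= eta * grad_W
```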
