1 code implementation • 29 May 2023 • Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
The picture drastically improves over multiple gradient steps: we show that a batch size of $n = \mathcal{O}(d)$ is indeed enough to learn multiple target directions satisfying a staircase property, where more and more directions can be learned over time.
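A minimal sketch of the setting described above, not the paper's code: a two-direction staircase target ($y = z_1 + z_1 z_2$, so the second direction only becomes visible through its product with the first), trained with a few full-batch gradient steps of batch size $n = 4d$. The learning-rate scaling, widths, and activation are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, n, lr, steps = 100, 50, 400, 0.5, 5   # batch size n = 4*d, i.e. O(d)

# Staircase target: y = z1 + z1*z2, where z = x @ W_star.T.
# z2 only enters through the product with z1 -- the staircase property.
W_star = rng.standard_normal((2, d)) / np.sqrt(d)
def target(X):
    z = X @ W_star.T
    return z[:, 0] + z[:, 0] * z[:, 1]

W = rng.standard_normal((p, d)) / np.sqrt(d)   # student first layer
a = np.ones(p) / p                             # fixed second layer

for t in range(steps):
    X = rng.standard_normal((n, d))            # fresh O(d)-size batch per step
    y = target(X)
    h = np.tanh(X @ W.T)                       # (n, p) hidden activations
    err = h @ a - y
    grad = ((err[:, None] * a) * (1 - h ** 2)).T @ X / n
    W -= lr * np.sqrt(d) * grad                # large rate; scaling is a guess
    print(t, np.abs(W @ W_star.T).max(axis=0)) # overlap with each direction
```

The printed overlaps give a rough picture of directions being picked up one after the other as steps accumulate.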
1 code implementation • 29 May 2023 • Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan
These insights rest on reducing the SGD dynamics to a stochastic process in lower dimensions, where escaping mediocrity amounts to computing an exit time.
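To fix ideas, here is a toy version of that reduction: treat the student-teacher overlap $m$ as a one-dimensional diffusion and measure the time to escape a neighbourhood of $m = 0$. The drift and diffusion coefficients below are arbitrary stand-ins, not the paper's expressions.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 1000                        # input dimension sets the noise scale
dt, m0, thresh = 1e-2, 1e-3, 0.1

def drift(m): return m - m ** 3            # toy drift pushing |m| toward 1
def diffusion(m): return 1.0 / np.sqrt(d)  # SGD-like noise, shrinking with d

# Euler-Maruyama simulation of the exit time from the "mediocre" region |m| < thresh
exit_times = []
for _ in range(200):
    m, t = m0, 0.0
    while abs(m) < thresh:
        m += drift(m) * dt + diffusion(m) * np.sqrt(dt) * rng.standard_normal()
        t += dt
    exit_times.append(t)
print("mean exit time:", np.mean(exit_times))
```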
1 code implementation • 17 Feb 2023 • Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan
Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we ask: when is a single Gaussian enough to characterize the error?
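A quick empirical check in the spirit of this question, with arbitrary sizes and ridge penalty: fit the same ridge regression on Rademacher inputs and on Gaussian inputs with the same (identity) covariance, and compare test errors.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n, lam = 200, 400, 0.1
w_star = rng.standard_normal(d) / np.sqrt(d)   # planted linear teacher

def ridge_test_error(sample):
    X, Xt = sample((n, d)), sample((4 * n, d))
    y, yt = X @ w_star, Xt @ w_star
    # Closed-form ridge estimator, then test mean-squared error
    w = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    return np.mean((Xt @ w - yt) ** 2)

gauss = lambda s: rng.standard_normal(s)
rade  = lambda s: rng.choice([-1.0, 1.0], size=s)  # same identity covariance
print("Gaussian  :", ridge_test_error(gauss))
print("Rademacher:", ridge_test_error(rade))
```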
1 code implementation • 12 Feb 2023 • Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro
This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar, though not necessarily identical, target function.
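A minimal one-pass (online) SGD sketch under the assumptions stated in the abstract: Gaussian inputs, labels from a two-layer teacher whose width $k$ differs from the student width $p$, one fresh sample per step. All hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k, p, lr, T = 100, 2, 4, 0.1, 50_000

Wt = rng.standard_normal((k, d)) / np.sqrt(d)   # teacher first layer
Ws = rng.standard_normal((p, d)) / np.sqrt(d)   # student first layer

for _ in range(T):
    x = rng.standard_normal(d)                  # one fresh Gaussian sample
    y = np.tanh(Wt @ x).sum() / k               # teacher label
    h = np.tanh(Ws @ x)
    err = h.sum() / p - y
    Ws -= (lr / p) * err * np.outer(1 - h ** 2, x)  # tanh' = 1 - tanh^2

print("student-teacher overlaps:\n", Ws @ Wt.T)
```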
2 code implementations • 26 May 2022 • Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová
We argue that there is a large universality class of high-dimensional input data for which we obtain the same minimum training loss as for Gaussian data with corresponding data covariance.
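An illustrative version of this statement, not the paper's experiment: minimise the same (ridge-regularised logistic) training loss on structured inputs, here a random-feature map, and on Gaussian inputs with matching mean and covariance, then compare the minima. Sizes and the penalty are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
d0, d, n, lam = 50, 100, 80, 0.05
F = rng.standard_normal((d, d0)) / np.sqrt(d0)
feat = lambda m: np.tanh(rng.standard_normal((m, d0)) @ F.T)  # structured data

# Match the first two moments using a large held-out structured sample
big = feat(5000)
mu, cov = big.mean(0), np.cov(big, rowvar=False)

U = feat(n)                                   # structured training set
G = rng.multivariate_normal(mu, cov, size=n)  # Gaussian counterpart
y = rng.choice([-1.0, 1.0], size=n)           # random labels

def min_train_loss(X):
    # Numerically stable logistic loss plus a small ridge term
    f = lambda w: np.mean(np.logaddexp(0.0, -y * (X @ w))) + 0.5 * lam * w @ w
    return minimize(f, np.zeros(d)).fun

print("structured:", min_train_loss(U))
print("gaussian  :", min_train_loss(G))
```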
1 code implementation • 14 Mar 2022 • Ludovic Stephan, Yizhe Zhu
We consider the community detection problem in a sparse $q$-uniform hypergraph $G$, assuming that $G$ is generated according to the Hypergraph Stochastic Block Model (HSBM).
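For concreteness, a toy sampler for a sparse $q$-uniform HSBM with two communities: each $q$-subset of vertices becomes a hyperedge with probability $c_{\mathrm{in}}/\binom{n-1}{q-1}$ if all its vertices share a label and $c_{\mathrm{out}}/\binom{n-1}{q-1}$ otherwise, so expected degrees stay $O(1)$. Enumerating all $q$-subsets is exponential; this is only meant to fix ideas for small $n$.

```python
import itertools, math
import numpy as np

rng = np.random.default_rng(5)
n, q, c_in, c_out = 60, 3, 8.0, 2.0
labels = rng.integers(0, 2, size=n)           # two balanced-ish communities

scale = math.comb(n - 1, q - 1)               # keeps expected degree O(1)
edges = []
for e in itertools.combinations(range(n), q):
    same = len({labels[v] for v in e}) == 1   # all vertices in one community?
    p = (c_in if same else c_out) / scale
    if rng.random() < p:
        edges.append(e)
print(len(edges), "hyperedges sampled")
```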
2 code implementations • 1 Feb 2022 • Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová
Despite the non-convex optimization landscape, over-parametrized shallow networks can achieve global convergence under gradient descent.
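A sketch of that phenomenon on a small random dataset: a shallow network with width far exceeding the number of samples, driven to near-zero training loss by plain gradient descent. Width, learning rate, and the tanh nonlinearity are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n, p, lr = 5, 20, 500, 0.05     # width p >> n: over-parametrised
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)          # random labels to memorise

W = rng.standard_normal((p, d)) / np.sqrt(d)
a = rng.standard_normal(p) / np.sqrt(p)

for _ in range(5000):
    H = np.tanh(X @ W.T)            # (n, p) hidden activations
    err = H @ a - y
    a -= lr * H.T @ err / n         # gradient step on both layers
    W -= lr * ((err[:, None] * a) * (1 - H ** 2)).T @ X / n

print("train MSE:", np.mean(err ** 2))
```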
1 code implementation • 5 Feb 2021 • Simon Coste, Ludovic Stephan
We study the task of clustering in directed networks.
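As a baseline illustration, not necessarily the estimator analysed in the paper: sample a directed stochastic block model (each ordered pair connects independently, no symmetrisation) and split vertices along a leading singular vector of the non-symmetric adjacency matrix.

```python
import numpy as np

rng = np.random.default_rng(7)
n, p_in, p_out = 400, 0.08, 0.02
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = (rng.random((n, n)) < P).astype(float)   # directed: A need not be symmetric
np.fill_diagonal(A, 0)

# Second left singular vector carries the community signal in this toy model
U, s, Vt = np.linalg.svd(A)
pred = (U[:, 1] > np.median(U[:, 1])).astype(int)
acc = max(np.mean(pred == labels), 1 - np.mean(pred == labels))
print("clustering accuracy:", acc)
```

In sparser regimes this plain SVD baseline degrades, which is precisely where more careful spectral approaches become relevant.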