Search Results for author: Luca Pesce

Found 5 papers, 4 papers with code

Asymptotics of feature learning in two-layer networks after one gradient-step

1 code implementation • 7 Feb 2024 • Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

To our knowledge, our results provide the first tight description of the impact of feature learning on the generalization of two-layer neural networks in the large learning rate regime $\eta=\Theta_{d}(d)$, beyond perturbative finite-width corrections of the conjugate and neural tangent kernels.
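To make the setting concrete, here is a minimal numpy sketch (not the authors' code) of a single large gradient step on the first-layer weights of a two-layer network, with learning rate $\eta$ proportional to the input dimension $d$; the tanh single-index teacher, the width, and the batch size are illustrative assumptions.

```python
# Minimal sketch: one "giant" gradient step (eta = Theta(d)) on the first layer
# of a two-layer network f(x) = a . tanh(Wx), squared loss, fixed second layer.
# All sizes and the single-index teacher below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, p, n = 256, 128, 2048                   # input dim, hidden width, batch size
eta = 1.0 * d                              # large learning rate: eta = Theta(d)

W = rng.normal(size=(p, d)) / np.sqrt(d)   # first-layer weights
a = rng.normal(size=p) / np.sqrt(p)        # second-layer weights (kept fixed)
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

X = rng.normal(size=(n, d))
y = np.tanh(X @ w_star)                    # teacher: single-index target

hidden = np.tanh(X @ W.T)                  # (n, p)
err = hidden @ a - y                       # residual of f(x) = a . tanh(Wx)
dhidden = (1 - hidden**2) * (err[:, None] * a[None, :])
grad_W = dhidden.T @ X / n                 # (p, d) batch-averaged gradient

W_after = W - eta * grad_W                 # the single large step

# Alignment of the hidden-layer weights with the target direction
print("mean |<w_i, w*>| before:", np.abs(W @ w_star).mean())
print("mean |<w_i, w*>| after :", np.abs(W_after @ w_star).mean())
```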

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents

no code implementations • 5 Feb 2024 • Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala

In particular, multi-pass GD with finite stepsize is found to overcome the limitations of gradient flow and single-pass GD given by the information exponent (Ben Arous et al., 2021) and leap exponent (Abbe et al., 2023) of the target function.
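As a rough illustration of the comparison (illustrative assumptions only, not the paper's exact setup), the sketch below runs spherical gradient descent on a single-index model with an information-exponent-3 link, once with a fresh batch per step (single-pass) and once reusing the same batch of size $n = 2d$ (multi-pass), and reports the overlap with the target direction.

```python
# Minimal sketch: single-pass GD on fresh data vs. multi-pass GD reusing one
# batch, for a single-index target with a high information exponent.
# Dimensions, step size, and the He3 link are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
d, n, lr, T = 200, 400, 0.5, 200
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

def link(z):                               # Hermite He3: information exponent 3
    return z**3 - 3 * z

def gd_step(w, X, y, lr):
    # squared-loss gradient for the student f(x) = link(<w, x>)
    z = X @ w
    g = -(((y - link(z)) * (3 * z**2 - 3))[:, None] * X).mean(0)
    w = w - lr * g
    return w / np.linalg.norm(w)           # spherical (projected) update

w0 = rng.normal(size=d)
w0 /= np.linalg.norm(w0)
w_single, w_multi = w0.copy(), w0.copy()
X0 = rng.normal(size=(n, d))
y0 = link(X0 @ w_star)                     # the batch that gets reused

for _ in range(T):
    Xf = rng.normal(size=(n, d))                          # fresh batch
    w_single = gd_step(w_single, Xf, link(Xf @ w_star), lr)
    w_multi = gd_step(w_multi, X0, y0, lr)                # same batch again

print("overlap |<w, w*>|, single-pass (fresh data) :", abs(w_single @ w_star))
print("overlap |<w, w*>|, multi-pass (reused batch):", abs(w_multi @ w_star))
```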

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time

1 code implementation • 29 May 2023 • Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

The picture drastically improves over multiple gradient steps: we show that a batch-size of $n = \mathcal{O}(d)$ is indeed enough to learn multiple target directions satisfying a staircase property, where more and more directions can be learned over time.
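The toy sketch below (all constants are arbitrary assumptions) runs a few large gradient steps on the first layer of a two-layer network with fresh batches of size $n = 2d$ and a two-direction staircase target $y = z_1 + z_1 z_2$, then reports how the top directions of the accumulated weight update align with the two target directions; whether the second direction shows up in such a small run depends on the chosen constants.

```python
# Minimal sketch: several large gradient steps on the first layer, fresh batch
# of size n = O(d) per step, two-direction staircase target. Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
d, p, eta, steps = 200, 100, 200.0, 3
n = 2 * d                                      # batch size n = O(d)
W = rng.normal(size=(p, d)) / np.sqrt(d)
W0 = W.copy()
a = np.ones(p) / np.sqrt(p)                    # fixed second layer

w1 = np.zeros(d); w1[0] = 1.0                  # first target direction
w2 = np.zeros(d); w2[1] = 1.0                  # second target direction

def target(X):
    z1, z2 = X @ w1, X @ w2
    return z1 + z1 * z2                        # staircase: z2 enters only through z1

def grad_W(W, X, y):
    h = np.tanh(X @ W.T)                       # (n, p)
    err = h @ a - y
    return ((1 - h**2) * (err[:, None] * a[None, :])).T @ X / len(y)

for t in range(steps):
    X = rng.normal(size=(n, d))                # fresh O(d)-sized batch each step
    W = W - eta * grad_W(W, X, target(X))
    _, _, Vt = np.linalg.svd(W - W0)           # directions picked up so far
    print(f"step {t+1}: top update directions vs (w1, w2): "
          f"{abs(Vt[0] @ w1):.2f}/{abs(Vt[0] @ w2):.2f}, "
          f"{abs(Vt[1] @ w1):.2f}/{abs(Vt[1] @ w2):.2f}")
```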

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

1 code implementation • 17 Feb 2023 • Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we ask ourselves the question: "when is a single Gaussian enough to characterize the error?".
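A quick way to probe this question empirically (a sketch under simple assumptions: ridge regression as the generalized linear estimator, a Rademacher design as the non-Gaussian comparison) is to match the first two moments of the covariates and compare test errors.

```python
# Minimal sketch: does the test error of ridge regression change when Gaussian
# covariates are replaced by Rademacher ones with the same first two moments?
# Estimator, designs, and dimensions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
d, n, n_test, lam = 300, 600, 2000, 0.1
w_star = rng.normal(size=d) / np.sqrt(d)

def make_sampler(draw_x):
    def sample(m):
        X = draw_x(m)
        y = X @ w_star + 0.1 * rng.normal(size=m)   # noisy linear teacher
        return X, y
    return sample

def ridge_test_error(sample):
    X, y = sample(n)
    Xt, yt = sample(n_test)
    w_hat = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    return np.mean((Xt @ w_hat - yt) ** 2)

gauss = make_sampler(lambda m: rng.normal(size=(m, d)))
rademacher = make_sampler(lambda m: rng.choice([-1.0, 1.0], size=(m, d)))

print("test error, Gaussian design  :", ridge_test_error(gauss))
print("test error, Rademacher design:", ridge_test_error(rademacher))
```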

Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

1 code implementation • 26 May 2022 • Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors.
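For concreteness, here is a small numpy sketch (parameters are illustrative) that draws data from a two-cluster mixture with a sparse mean and recovers the mean direction with a plain spectral estimate (the top eigenvector of the sample covariance).

```python
# Minimal sketch: two-cluster Gaussian mixture x = s * v + noise with a sparse
# mean v, plus a spectral estimate of v. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(4)
d, n, k_sparse, snr = 500, 400, 30, 3.0

v = np.zeros(d)
support = rng.choice(d, size=k_sparse, replace=False)
v[support] = rng.normal(size=k_sparse)
v *= np.sqrt(snr) / np.linalg.norm(v)            # ||v||^2 = snr

s = rng.choice([-1.0, 1.0], size=n)              # hidden cluster labels
X = s[:, None] * v[None, :] + rng.normal(size=(n, d))

# Spectral estimate: top eigenvector of the sample covariance
cov = X.T @ X / n
eigvals, eigvecs = np.linalg.eigh(cov)
v_hat = eigvecs[:, -1]

overlap = abs(v_hat @ v) / np.linalg.norm(v)
print(f"overlap between top eigenvector and the sparse mean: {overlap:.2f}")
```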

