no code implementations • 16 Apr 2024 • Umberto Tomasini, Matthieu Wyart
Understanding what makes high-dimensional data learnable is a fundamental question in machine learning.
1 code implementation • 26 Feb 2024 • Antonio Sclocchi, Alessandro Favero, Matthieu Wyart
We find that the backward diffusion process acting after a time $t$ is governed by a phase transition at some threshold time, where the probability of reconstructing high-level features, like the class of an image, suddenly drops.
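A conceptual sketch of the forward-backward probe this finding is based on; the `forward_noise`, `denoise`, and `classify` helpers are hypothetical placeholders standing in for a real diffusion model and classifier, not the paper's code.

```python
# Hypothetical skeleton: noise each image up to time t, run the backward
# diffusion from t, and measure how often the original class survives.
def class_retention(images, labels, t, forward_noise, denoise, classify):
    kept = 0
    for x, y in zip(images, labels):
        x_t = forward_noise(x, t)   # forward diffusion up to time t
        x_hat = denoise(x_t, t)     # backward process started at time t
        kept += int(classify(x_hat) == y)
    return kept / len(images)

# Sweeping t should show the retention probability dropping sharply
# around a threshold time, as described above.
```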
1 code implementation • 19 Sep 2023 • Antonio Sclocchi, Matthieu Wyart
For small $B$ and large $\eta$, SGD corresponds to a stochastic evolution of the parameters, whose noise amplitude is governed by the "temperature" $T \equiv \eta/B$.
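A minimal sketch of what holding the temperature fixed means in practice: two runs with different batch sizes and learning rates but the same $T = \eta/B$. The architecture, data, and hyperparameters below are illustrative assumptions, not the paper's setup.

```python
import torch
from torch import nn

# Toy data: 10-dimensional Gaussian inputs with a binary label.
X = torch.randn(512, 10)
y = (X[:, 0] > 0).long()
dataset = torch.utils.data.TensorDataset(X, y)

def train(B, T, epochs=5):
    eta = T * B  # learning rate fixed by the "temperature" T = eta / B
    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.SGD(model.parameters(), lr=eta)
    loader = torch.utils.data.DataLoader(dataset, batch_size=B, shuffle=True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    return model

# Same temperature, different (eta, B) pairs.
train(B=8, T=0.01)    # eta = 0.08
train(B=32, T=0.01)   # eta = 0.32
```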
1 code implementation • 5 Jul 2023 • Francesco Cagnetta, Leonardo Petrini, Umberto M. Tomasini, Alessandro Favero, Matthieu Wyart
The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class.
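An illustrative two-level sketch of such a task, in which each class has several equivalent ("synonymous") groups of high-level features, each of which in turn expands into equivalent groups of low-level features. The specific parameters and sampling scheme below are assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, m, s, v = 4, 3, 2, 8  # classes, synonyms per rule, tuple size, vocabulary

# Each class maps to m equivalent tuples of s high-level features...
class_rules = {c: [rng.integers(v, size=s) for _ in range(m)] for c in range(n_classes)}
# ...and each high-level feature maps to m equivalent tuples of s low-level features.
feature_rules = {f: [rng.integers(v, size=s) for _ in range(m)] for f in range(v)}

def sample(c):
    """Draw one input for class c by picking a synonym at every level."""
    high = class_rules[c][rng.integers(m)]
    low = [feature_rules[f][rng.integers(m)] for f in high]
    return np.concatenate(low)  # low-level string of length s*s

X = np.stack([sample(c) for c in range(n_classes) for _ in range(10)])
```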
no code implementations • 31 Jan 2023 • Antonio Sclocchi, Mario Geiger, Matthieu Wyart
They show that SGD noise can be detrimental or instead useful depending on the training regime.
1 code implementation • 4 Oct 2022 • Umberto M. Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart
Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by both spatial and channel pooling, (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling and (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling.
1 code implementation • 1 Aug 2022 • Francesco Cagnetta, Alessandro Favero, Matthieu Wyart
Interestingly, we find that, despite their hierarchical structure, the functions generated by infinitely-wide deep CNNs are too rich to be efficiently learnable in high dimension.
1 code implementation • 24 Jun 2022 • Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart
It is widely believed that the success of deep networks lies in their ability to learn a meaningful representation of the features of the data.
no code implementations • 7 Feb 2022 • Umberto M. Tomasini, Antonio Sclocchi, Matthieu Wyart
Recently, several theories, including the replica method, have made predictions for the generalization error of Kernel Ridge Regression.
no code implementations • NeurIPS 2021 • Alessandro Favero, Francesco Cagnetta, Matthieu Wyart
Convolutional neural networks perform a local and translationally invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge.
1 code implementation • 16 Jun 2021 • Mario Geiger, Christophe Eloy, Matthieu Wyart
Reinforcement learning is generally difficult for partially observable Markov decision processes (POMDPs), which occur when the agent's observation is partial or noisy.
2 code implementations • NeurIPS 2021 • Leonardo Petrini, Alessandro Favero, Mario Geiger, Matthieu Wyart
Understanding why deep nets can classify data in large dimensions remains a challenge.
no code implementations • 5 Jan 2021 • Hugo Perrin, Matthieu Wyart, Bloen Metzger, Yoël Forterre
The jamming transition is accompanied by a rich phenomenology, such as hysteresis or non-local effects, which is still not well understood.
Soft Condensed Matter
1 code implementation • 30 Dec 2020 • Mario Geiger, Leonardo Petrini, Matthieu Wyart
In this manuscript, we review recent results elucidating (i, ii) and the perspective they offer on the (still unexplained) curse of dimensionality paradox.
1 code implementation • 22 Jul 2020 • Jonas Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart
We confirm these predictions both for a one-hidden-layer FC network trained on the stripe model and for a 16-layer CNN trained on MNIST, for which we also find $\beta_\text{Feature}>\beta_\text{Lazy}$.
no code implementations • 17 Jun 2020 • Jonas Paccolat, Stefano Spigler, Matthieu Wyart
(ii) Next we consider support-vector binary classification and introduce the stripe model, where the data label depends on a single coordinate, $y(\underline{x}) = y(x_1)$, corresponding to parallel decision boundaries separating labels of different signs, and we assume that there is no margin at these interfaces.
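A minimal sketch of one concrete instance of the stripe model (the particular choice of $y(x_1)$ and the SVM hyperparameters are assumptions, not the paper's protocol): the label depends only on the first coordinate, giving parallel decision boundaries with no imposed margin.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
d, n = 10, 2000
X = rng.standard_normal((n, d))
y = np.sign(np.sin(np.pi * X[:, 0]))  # stripes along x_1; other coordinates are irrelevant

# Kernel SVM on the first half, evaluated on the second half.
clf = SVC(kernel="rbf").fit(X[:1000], y[:1000])
print("test accuracy:", clf.score(X[1000:], y[1000:]))
```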
no code implementations • 19 Jun 2019 • Mario Geiger, Stefano Spigler, Arthur Jacot, Matthieu Wyart
Two distinct limits for deep learning have been derived as the network width $h\rightarrow \infty$, depending on how the weights of the last layer scale with $h$.
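A sketch of the two standard last-layer scalings usually associated with these limits (a common parameterization assumed here for illustration, not necessarily the paper's exact convention): a $1/\sqrt{h}$ prefactor gives a lazy, kernel-like limit, while a $1/h$ prefactor gives the feature-learning (mean-field) limit.

```python
import torch
from torch import nn

class TwoLayer(nn.Module):
    def __init__(self, d, h, alpha):
        super().__init__()
        self.hidden = nn.Linear(d, h)
        self.out = nn.Linear(h, 1, bias=False)
        self.alpha = alpha  # overall scale of the output layer

    def forward(self, x):
        return self.alpha * self.out(torch.relu(self.hidden(x)))

h = 1024
lazy = TwoLayer(10, h, alpha=h ** -0.5)     # kernel / "lazy" scaling
feature = TwoLayer(10, h, alpha=1.0 / h)    # mean-field / feature-learning scaling
```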
no code implementations • 26 May 2019 • Stefano Spigler, Mario Geiger, Matthieu Wyart
We extract $a$ from real data by performing kernel PCA, leading to $\beta \approx 0.36$ for MNIST and $\beta \approx 0.07$ for CIFAR10, in good agreement with observations.
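A rough sketch of how such an exponent can be estimated from data via kernel PCA (the kernel, bandwidth, and fitting range below are assumptions, and the data is a random placeholder rather than MNIST or CIFAR10): eigen-decompose a Gram matrix and fit a power law $\lambda_k \sim k^{-a}$ on a log-log scale.

```python
import numpy as np

def spectrum_exponent(X, bandwidth=1.0, k_min=10, k_max=200):
    # Laplace-kernel Gram matrix on the samples in X.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    K = np.exp(-dist / bandwidth)
    lam = np.sort(np.linalg.eigvalsh(K))[::-1] / len(X)
    # Fit lambda_k ~ k^{-a} over an intermediate range of indices.
    k = np.arange(k_min, k_max)
    slope, _ = np.polyfit(np.log(k), np.log(lam[k]), 1)
    return -slope  # estimate of the decay exponent a

X = np.random.default_rng(0).standard_normal((500, 30))  # placeholder data
print("estimated a:", spectrum_exponent(X))
```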
2 code implementations • 16 Apr 2019 • Tom W. J. de Geus, Marko Popović, Wencheng Ji, Alberto Rosso, Matthieu Wyart
Sliding at a quasi-statically loaded frictional interface occurs via macroscopic slip events, which nucleate locally before propagating as rupture fronts very similar to fracture.
Disordered Systems and Neural Networks • Statistical Mechanics
1 code implementation • 6 Jan 2019 • Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart
At this threshold, we argue that $\|f_{N}\|$ diverges.
no code implementations • 22 Oct 2018 • Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart
We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.
2 code implementations • 25 Sep 2018 • Mario Geiger, Stefano Spigler, Stéphane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart
In the vicinity of this transition, properties of the curvature of the minima of the loss are critical.
no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli
We analyze numerically the training dynamics of deep neural networks (DNNs) by using methods developed in the statistical physics of glassy systems.