Search Results for author: Matthieu Wyart

Found 23 papers, 13 papers with code

How Deep Networks Learn Sparse and Hierarchical Data: the Sparse Random Hierarchy Model

no code implementations • 16 Apr 2024 • Umberto Tomasini, Matthieu Wyart

Understanding what makes high-dimensional data learnable is a fundamental question in machine learning.

A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data

1 code implementation • 26 Feb 2024 • Antonio Sclocchi, Alessandro Favero, Matthieu Wyart

We find that the backward diffusion process acting after a time $t$ is governed by a phase transition at some threshold time, where the probability of reconstructing high-level features, like the class of an image, suddenly drops.

On the different regimes of Stochastic Gradient Descent

1 code implementation • 19 Sep 2023 • Antonio Sclocchi, Matthieu Wyart

For small $B$ and large $\eta$, SGD corresponds to a stochastic evolution of the parameters, whose noise amplitude is governed by the "temperature" $T\equiv \eta/B$.
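As a rough illustration of this temperature, the following numpy sketch (not the paper's released code; the loss, step counts and hyperparameters are made up for the example) runs plain SGD on a toy quadratic loss: the three runs share the same $\eta/B$ and give comparable stationary fluctuations of the parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 1-D linear regression, loss L(w) = mean((w*x - y)^2) / 2
N = 10_000
x = rng.normal(size=N)
y = 2.0 * x + 0.5 * rng.normal(size=N)   # noisy targets around w* = 2

def sgd_fluctuations(eta, B, steps=20_000, burn_in=5_000):
    """Run plain SGD and return the variance of w in the stationary regime."""
    w, trace = 0.0, []
    for t in range(steps):
        idx = rng.integers(0, N, size=B)                 # sample a mini-batch
        grad = np.mean((w * x[idx] - y[idx]) * x[idx])   # batch gradient
        w -= eta * grad
        if t >= burn_in:
            trace.append(w)
    return np.var(trace)

# Runs with different eta and B but the same "temperature" T = eta / B
for eta, B in [(0.05, 10), (0.10, 20), (0.20, 40)]:
    print(f"eta={eta:.2f}  B={B:3d}  T={eta / B:.4f}  var(w)={sgd_fluctuations(eta, B):.2e}")
```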

How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model

1 code implementation • 5 Jul 2023 • Francesco Cagnetta, Leonardo Petrini, Umberto M. Tomasini, Alessandro Favero, Matthieu Wyart

The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class.
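For intuition, here is a minimal generative sketch in the spirit of the Random Hierarchy Model (hyperparameter names and values are illustrative, and the production rules are drawn i.i.d. rather than with the constraints used in the paper): a class label expands into one of $m$ equivalent tuples of high-level features, each of which recursively expands down to the input symbols.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters: v symbols per level, n_c classes,
# s children per symbol, m equivalent productions per symbol, depth L.
v, n_c, s, m, L = 8, 4, 2, 3, 3

# One random "grammar" per level: rules[l][a] is an (m, s) array holding the
# m equivalent s-tuples that symbol a can expand into at level l.
rules = [rng.integers(0, v, size=(v, m, s)) for _ in range(L)]

def sample(label):
    """Sample one input string of length s**L for a given class label."""
    symbols = [label]
    for level in range(L):
        expanded = []
        for a in symbols:
            choice = rng.integers(0, m)               # pick one equivalent production
            expanded.extend(rules[level][a, choice])
        symbols = expanded
    return np.array(symbols)

labels = rng.integers(0, n_c, size=5)
X = np.stack([sample(c) for c in labels])
print(X.shape)   # (5, s**L) strings of low-level features
```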

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning

no code implementations • 31 Jan 2023 • Antonio Sclocchi, Mario Geiger, Matthieu Wyart

These results show that SGD noise can be detrimental or instead useful depending on the training regime.

How deep convolutional neural networks lose spatial information with training

1 code implementation • 4 Oct 2022 • Umberto M. Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart

Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by both spatial and channel pooling, (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling and (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling.

What Can Be Learnt With Wide Convolutional Neural Networks?

1 code implementation • 1 Aug 2022 • Francesco Cagnetta, Alessandro Favero, Matthieu Wyart

Interestingly, we find that, despite their hierarchical structure, the functions generated by infinitely-wide deep CNNs are too rich to be efficiently learnable in high dimension.

Learning sparse features can lead to overfitting in neural networks

1 code implementation • 24 Jun 2022 • Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart

It is widely believed that the success of deep networks lies in their ability to learn a meaningful representation of the features of the data.

Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data

no code implementations • 7 Feb 2022 • Umberto M. Tomasini, Antonio Sclocchi, Matthieu Wyart

Recently, several theories including the replica method made predictions for the generalization error of Kernel Ridge Regression.

Locality defeats the curse of dimensionality in convolutional teacher-student scenarios

no code implementations • NeurIPS 2021 • Alessandro Favero, Francesco Cagnetta, Matthieu Wyart

Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge.

regression

How memory architecture affects learning in a simple POMDP: the two-hypothesis testing problem

1 code implementation • 16 Jun 2021 • Mario Geiger, Christophe Eloy, Matthieu Wyart

Reinforcement learning is generally difficult for partially observable Markov decision processes (POMDPs), which arise when the agent's observations are partial or noisy.

Non-local effects reflect the jamming criticality in granular flows of frictionless particles

no code implementations • 5 Jan 2021 • Hugo Perrin, Matthieu Wyart, Bloen Metzger, Yoël Forterre

The jamming transition is accompanied by a rich phenomenology, such as hysteresis or non-local effects, which is still not well understood.

Soft Condensed Matter

Perspective: A Phase Diagram for Deep Learning unifying Jamming, Feature Learning and Lazy Training

1 code implementation • 30 Dec 2020 • Mario Geiger, Leonardo Petrini, Matthieu Wyart

In this manuscript, we review recent results elucidating (i, ii) and the perspective they offer on the (still unexplained) curse of dimensionality paradox.

Geometric compression of invariant manifolds in neural nets

1 code implementation • 22 Jul 2020 • Jonas Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart

We confirm these predictions both for a one-hidden layer FC network trained on the stripe model and for a 16-layers CNN trained on MNIST, for which we also find $\beta_\text{Feature}>\beta_\text{Lazy}$.

How isotropic kernels perform on simple invariants

no code implementations • 17 Jun 2020 • Jonas Paccolat, Stefano Spigler, Matthieu Wyart

We consider support-vector binary classification and introduce the stripe model, where the data label depends on a single coordinate, $y(\underline{x}) = y(x_1)$, corresponding to parallel decision boundaries separating labels of different signs, with no margin at these interfaces.

Binary Classification • regression
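A short sketch of the stripe model described above, in its simplest single-interface version (this uses scikit-learn for the support-vector classifier and is not the authors' code): inputs are Gaussian in $d$ dimensions and the label is the sign of the first coordinate only.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
d, n_train, n_test = 10, 2000, 2000

def stripe_data(n):
    """Gaussian inputs; the label depends only on the first coordinate."""
    X = rng.normal(size=(n, d))
    y = np.sign(X[:, 0])
    return X, y

X_tr, y_tr = stripe_data(n_train)
X_te, y_te = stripe_data(n_test)

# Isotropic (RBF) kernel classifier on a task with a single relevant direction.
clf = SVC(kernel="rbf", C=10.0).fit(X_tr, y_tr)
print("test error:", np.mean(clf.predict(X_te) != y_te))
```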

Disentangling feature and lazy training in deep neural networks

no code implementations • 19 Jun 2019 • Mario Geiger, Stefano Spigler, Arthur Jacot, Matthieu Wyart

Two distinct limits for deep learning have been derived as the network width $h\rightarrow \infty$, depending on how the weights of the last layer scale with $h$.
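Concretely, the two limits correspond to two ways of normalizing the output with the width $h$; a minimal one-hidden-layer sketch (illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

d, h = 5, 10_000
W = rng.normal(size=(h, d))          # first-layer weights
a = rng.normal(size=h)               # last-layer weights
x = rng.normal(size=d)

phi = np.maximum(W @ x, 0.0)         # hidden-layer activations (ReLU)

f_lazy    = a @ phi / np.sqrt(h)     # NTK-like scaling: O(1) output, features barely move
f_feature = a @ phi / h              # mean-field scaling: output vanishes at init, features must move

print(f_lazy, f_feature)
```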

Asymptotic learning curves of kernel methods: empirical data v.s. Teacher-Student paradigm

no code implementations • 26 May 2019 • Stefano Spigler, Mario Geiger, Matthieu Wyart

We extract $a$ from real data by performing kernel PCA, leading to $\beta\approx 0.36$ for MNIST and $\beta\approx 0.07$ for CIFAR10, in good agreement with observations.

regression
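The extraction of the exponent $a$ can be sketched as follows (synthetic Gaussian data stands in for MNIST/CIFAR10, and the Laplacian kernel is used as one example of an isotropic kernel; this is an illustration, not the paper's pipeline): build the kernel Gram matrix, take its spectrum (kernel PCA), and fit the power-law decay $\lambda_k \sim k^{-a}$.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)

# Synthetic placeholder data; the paper uses MNIST and CIFAR10 images.
n, d = 1000, 50
X = rng.normal(size=(n, d))

# Laplacian kernel Gram matrix on the data.
K = np.exp(-cdist(X, X) / np.sqrt(d))

# Kernel PCA spectrum = eigenvalues of the centered Gram matrix.
H = np.eye(n) - np.ones((n, n)) / n
eigvals = np.linalg.eigvalsh(H @ K @ H)[::-1] / n
eigvals = eigvals[eigvals > 1e-12]

# Fit lambda_k ~ k^(-a) over an intermediate range of ranks k.
k = np.arange(1, len(eigvals) + 1)
lo, hi = 10, 300
slope, _ = np.polyfit(np.log(k[lo:hi]), np.log(eigvals[lo:hi]), 1)
print("estimated spectral decay exponent a ~", -slope)
```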

How collective asperity detachments nucleate slip at frictional interfaces

2 code implementations • 16 Apr 2019 • Tom W. J. de Geus, Marko Popović, Wencheng Ji, Alberto Rosso, Matthieu Wyart

Sliding at a quasi-statically loaded frictional interface occurs via macroscopic slip events, which nucleate locally before propagating as rupture fronts very similar to fracture.

Disordered Systems and Neural Networks • Statistical Mechanics

A jamming transition from under- to over-parametrization affects loss landscape and generalization

no code implementations • 22 Oct 2018 • Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.
