no code implementations • 16 Apr 2024 • Umberto Tomasini, Matthieu Wyart
Understanding what makes high-dimensional data learnable is a fundamental question in machine learning.
1 code implementation • 26 Feb 2024 • Antonio Sclocchi, Alessandro Favero, Matthieu Wyart
We find that the backward diffusion process acting after a time $t$ is governed by a phase transition at some threshold time, where the probability of reconstructing high-level features, like the class of an image, suddenly drops.
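A conceptual sketch of the forward-backward probe this finding is based on; the `forward_noise`, `denoise`, and `classify` helpers are hypothetical placeholders standing in for a real diffusion model and classifier, not the paper's code.

```python
# Hypothetical skeleton: noise each image up to time t, run the backward
# diffusion from t, and measure how often the original class survives.
def class_retention(images, labels, t, forward_noise, denoise, classify):
    kept = 0
    for x, y in zip(images, labels):
        x_t = forward_noise(x, t)   # forward diffusion up to time t
        x_hat = denoise(x_t, t)     # backward process started at time t
        kept += int(classify(x_hat) == y)
    return kept / len(images)

# Sweeping t should show the retention probability dropping sharply
# around a threshold time, as described above.
```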
1 code implementation • 19 Sep 2023 • Antonio Sclocchi, Matthieu Wyart
For small $B$ and large $\eta$, SGD corresponds to a stochastic evolution of the parameters, whose noise amplitude is governed by the "temperature" $T \equiv \eta/B$.
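A minimal sketch of what holding the temperature fixed means in practice: two runs with different batch sizes and learning rates but the same $T = \eta/B$. The architecture, data, and hyperparameters below are illustrative assumptions, not the paper's setup.

```python
import torch
from torch import nn

# Toy data: 10-dimensional Gaussian inputs with a binary label.
X = torch.randn(512, 10)
y = (X[:, 0] > 0).long()
dataset = torch.utils.data.TensorDataset(X, y)

def train(B, T, epochs=5):
    eta = T * B  # learning rate fixed by the "temperature" T = eta / B
    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.SGD(model.parameters(), lr=eta)
    loader = torch.utils.data.DataLoader(dataset, batch_size=B, shuffle=True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    return model

# Same temperature, different (eta, B) pairs.
train(B=8, T=0.01)    # eta = 0.08
train(B=32, T=0.01)   # eta = 0.32
```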
1 code implementation • 5 Jul 2023 • Francesco Cagnetta, Leonardo Petrini, Umberto M. Tomasini, Alessandro Favero, Matthieu Wyart
The model is a classification task where each class corresponds to a group of high-level features, chosen among several equivalent groups associated with the same class.
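An illustrative two-level sketch of such a task, in which each class has several equivalent ("synonymous") groups of high-level features, each of which in turn expands into equivalent groups of low-level features. The specific parameters and sampling scheme below are assumptions, not the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, m, s, v = 4, 3, 2, 8  # classes, synonyms per rule, tuple size, vocabulary

# Each class maps to m equivalent tuples of s high-level features...
class_rules = {c: [rng.integers(v, size=s) for _ in range(m)] for c in range(n_classes)}
# ...and each high-level feature maps to m equivalent tuples of s low-level features.
feature_rules = {f: [rng.integers(v, size=s) for _ in range(m)] for f in range(v)}

def sample(c):
    """Draw one input for class c by picking a synonym at every level."""
    high = class_rules[c][rng.integers(m)]
    low = [feature_rules[f][rng.integers(m)] for f in high]
    return np.concatenate(low)  # low-level string of length s*s

X = np.stack([sample(c) for c in range(n_classes) for _ in range(10)])
```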
no code implementations • 31 Jan 2023 • Antonio Sclocchi, Mario Geiger, Matthieu Wyart
They show that SGD noise can be detrimental or instead useful depending on the training regime.
1 code implementation • 4 Oct 2022 • Umberto M. Tomasini, Leonardo Petrini, Francesco Cagnetta, Matthieu Wyart
Here, we (i) show empirically for various architectures that stability to image diffeomorphisms is achieved by both spatial and channel pooling, (ii) introduce a model scale-detection task which reproduces our empirical observations on spatial pooling and (iii) compute analytically how the sensitivity to diffeomorphisms and noise scales with depth due to spatial pooling.
1 code implementation • 1 Aug 2022 • Francesco Cagnetta, Alessandro Favero, Matthieu Wyart
Interestingly, we find that, despite their hierarchical structure, the functions generated by infinitely-wide deep CNNs are too rich to be efficiently learnable in high dimension.
1 code implementation • 24 Jun 2022 • Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart
It is widely believed that the success of deep networks lies in their ability to learn a meaningful representation of the features of the data.
no code implementations • 7 Feb 2022 • Umberto M. Tomasini, Antonio Sclocchi, Matthieu Wyart
Recently, several theories, including the replica method, have made predictions for the generalization error of Kernel Ridge Regression.
no code implementations • NeurIPS 2021 • Alessandro Favero, Francesco Cagnetta, Matthieu Wyart
Convolutional neural networks perform a local and translationally invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge.
1 code implementation • 16 Jun 2021 • Mario Geiger, Christophe Eloy, Matthieu Wyart
Reinforcement learning is generally difficult for partially observable Markov decision processes (POMDPs), which occur when the agent's observation is partial or noisy.
2 code implementations • NeurIPS 2021 • Leonardo Petrini, Alessandro Favero, Mario Geiger, Matthieu Wyart
Understanding why deep nets can classify data in large dimensions remains a challenge.
no code implementations • 5 Jan 2021 • Hugo Perrin, Matthieu Wyart, Bloen Metzger, Yoël Forterre
The jamming transition is accompanied by a rich phenomenology, such as hysteresis or non-local effects, which is still not well understood.
Soft Condensed Matter
1 code implementation • 30 Dec 2020 • Mario Geiger, Leonardo Petrini, Matthieu Wyart
In this manuscript, we review recent results elucidating (i, ii) and the perspective they offer on the (still unexplained) curse of dimensionality paradox.
1 code implementation • 22 Jul 2020 • Jonas Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart
We confirm these predictions both for a one-hidden-layer FC network trained on the stripe model and for a 16-layer CNN trained on MNIST, for which we also find $\beta_\text{Feature}>\beta_\text{Lazy}$.
no code implementations • 17 Jun 2020 • Jonas Paccolat, Stefano Spigler, Matthieu Wyart
(ii) Next we consider support-vector binary classification and introduce the stripe model, where the data label depends on a single coordinate, $y(\underline{x}) = y(x_1)$, corresponding to parallel decision boundaries separating labels of different signs, and we assume that there is no margin at these interfaces.
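A minimal sketch of one concrete instance of the stripe model (the particular choice of $y(x_1)$ and the SVM hyperparameters are assumptions, not the paper's protocol): the label depends only on the first coordinate, giving parallel decision boundaries with no imposed margin.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
d, n = 10, 2000
X = rng.standard_normal((n, d))
y = np.sign(np.sin(np.pi * X[:, 0]))  # stripes along x_1; other coordinates are irrelevant

# Kernel SVM on the first half, evaluated on the second half.
clf = SVC(kernel="rbf").fit(X[:1000], y[:1000])
print("test accuracy:", clf.score(X[1000:], y[1000:]))
```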
no code implementations • 19 Jun 2019 • Mario Geiger, Stefano Spigler, Arthur Jacot, Matthieu Wyart
Two distinct limits for deep learning have been derived as the network width $h\rightarrow \infty$, depending on how the weights of the last layer scale with $h$.
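A sketch of the two standard last-layer scalings usually associated with these limits (a common parameterization assumed here for illustration, not necessarily the paper's exact convention): a $1/\sqrt{h}$ prefactor gives a lazy, kernel-like limit, while a $1/h$ prefactor gives the feature-learning (mean-field) limit.

```python
import torch
from torch import nn

class TwoLayer(nn.Module):
    def __init__(self, d, h, alpha):
        super().__init__()
        self.hidden = nn.Linear(d, h)
        self.out = nn.Linear(h, 1, bias=False)
        self.alpha = alpha  # overall scale of the output layer

    def forward(self, x):
        return self.alpha * self.out(torch.relu(self.hidden(x)))

h = 1024
lazy = TwoLayer(10, h, alpha=h ** -0.5)     # kernel / "lazy" scaling
feature = TwoLayer(10, h, alpha=1.0 / h)    # mean-field / feature-learning scaling
```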
no code implementations • 26 May 2019 • Stefano Spigler, Mario Geiger, Matthieu Wyart
We extract $a$ from real data by performing kernel PCA, leading to $\beta \approx 0.36$ for MNIST and $\beta \approx 0.07$ for CIFAR10, in good agreement with observations.
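A rough sketch of how such an exponent can be estimated from data via kernel PCA (the kernel, bandwidth, and fitting range below are assumptions, and the data is a random placeholder rather than MNIST or CIFAR10): eigen-decompose a Gram matrix and fit a power law $\lambda_k \sim k^{-a}$ on a log-log scale.

```python
import numpy as np

def spectrum_exponent(X, bandwidth=1.0, k_min=10, k_max=200):
    # Laplace-kernel Gram matrix on the samples in X.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    K = np.exp(-dist / bandwidth)
    lam = np.sort(np.linalg.eigvalsh(K))[::-1] / len(X)
    # Fit lambda_k ~ k^{-a} over an intermediate range of indices.
    k = np.arange(k_min, k_max)
    slope, _ = np.polyfit(np.log(k), np.log(lam[k]), 1)
    return -slope  # estimate of the decay exponent a

X = np.random.default_rng(0).standard_normal((500, 30))  # placeholder data
print("estimated a:", spectrum_exponent(X))
```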
2 code implementations • 16 Apr 2019 • Tom W. J. de Geus, Marko Popović, Wencheng Ji, Alberto Rosso, Matthieu Wyart
Sliding at a quasi-statically loaded frictional interface occurs via macroscopic slip events, which nucleate locally before propagating as rupture fronts very similar to fracture.
Disordered Systems and Neural Networks • Statistical Mechanics
1 code implementation • 6 Jan 2019 • Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart
At this threshold, we argue that $\|f_{N}\|$ diverges.
no code implementations • 22 Oct 2018 • Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart
We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.
2 code implementations • 25 Sep 2018 • Mario Geiger, Stefano Spigler, Stéphane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart
In the vicinity of this transition, properties of the curvature of the minima of the loss are critical.
no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli
We analyze numerically the training dynamics of deep neural networks (DNNs) by using methods developed in the statistical physics of glassy systems.