Search Results for author: Hugo Cui

Found 14 papers, 7 papers with code

Fundamental limits of learning in sequence multi-index models and deep attention networks: High-dimensional asymptotics and sharp thresholds

1 code implementation • 2 Feb 2025 • Emanuele Troiani, Hugo Cui, Yatin Dandi, Florent Krzakala, Lenka Zdeborová

In this manuscript, we study the learning of deep attention neural networks, defined as the composition of multiple self-attention layers, with tied and low-rank weights.

Deep Attention

A precise asymptotic analysis of learning diffusion models: theory and insights

1 code implementation • 7 Jan 2025 • Hugo Cui, Cengiz Pehlevan, Yue M. Lu

In this manuscript, we consider the problem of learning a flow- or diffusion-based generative model, parametrized by a two-layer auto-encoder and trained with online stochastic gradient descent, on a high-dimensional target density with an underlying low-dimensional manifold structure.
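The training setup in this abstract is concrete enough to sketch. Below is a minimal illustration, not the paper's code: a two-layer auto-encoder with tied weights and a skip connection, trained by online SGD on a denoising objective. The one-dimensional cluster structure standing in for the low-dimensional manifold, and all dimensions, are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p, lr, steps = 100, 20, 0.05, 5000   # data dimension, hidden width, step size, SGD steps

# Target density with low-dimensional structure: samples cluster around +/- u in R^d.
u = rng.standard_normal(d) / np.sqrt(d)

W = rng.standard_normal((p, d)) / np.sqrt(d)   # tied encoder/decoder weights
c = 0.0                                        # trainable skip connection

for t in range(steps):
    x0 = np.sign(rng.standard_normal()) * u + 0.1 * rng.standard_normal(d)  # clean sample
    xt = x0 + rng.standard_normal(d)                                        # noised sample
    h = np.tanh(W @ xt)
    err = c * xt + W.T @ h - x0         # denoising residual f(xt) - x0
    # Exact gradients of the squared denoising loss for this tied architecture.
    gW = np.outer(h, err) + ((1 - h**2) * (W @ err))[:, None] * xt[None, :]
    gc = err @ xt
    W -= (lr / d) * gW
    c -= (lr / d) * gc
```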

A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities

no code implementations • 24 Oct 2024 • Yatin Dandi, Luca Pesce, Hugo Cui, Florent Krzakala, Yue M. Lu, Bruno Loureiro

This provides a sharp description of the impact of feature learning in the generalization of two-layer neural networks, beyond the random features and lazy training regimes.

High-dimensional learning of narrow neural networks

no code implementations • 20 Sep 2024 • Hugo Cui

Recent years have been marked by the fast-paced diversification and increasing ubiquity of machine learning applications.

Contrastive Learning • Denoising

Asymptotics of Learning with Deep Structured (Random) Features

1 code implementation • 21 Feb 2024 • Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro

For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large.
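A hedged sketch of the protocol this abstract analyzes: a fixed (untrained) deep random feature map, with only the readout layer learned by ridge regression. The widths, nonlinearity, and linear teacher below are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(1)
d, widths, n_train, n_test, lam = 200, [300, 300], 400, 2000, 1e-2

# A fixed (untrained) deep random feature map x -> phi(x).
Ws, prev = [], d
for w in widths:
    Ws.append(rng.standard_normal((w, prev)) / np.sqrt(prev))
    prev = w

def features(X):
    H = X
    for W in Ws:
        H = np.tanh(H @ W.T)
    return H

theta = rng.standard_normal(d) / np.sqrt(d)     # linear teacher (illustrative)
X_tr = rng.standard_normal((n_train, d))
X_te = rng.standard_normal((n_test, d))
y_tr, y_te = X_tr @ theta, X_te @ theta

# Only the readout layer is learned, here by ridge regression on the features.
Phi = features(X_tr)
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y_tr)
test_err = np.mean((features(X_te) @ a - y_te) ** 2)
print(f"readout test error: {test_err:.3f}")
```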

Asymptotics of feature learning in two-layer networks after one gradient-step

1 code implementation • 7 Feb 2024 • Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

In this manuscript, we investigate how two-layer neural networks learn features from data and improve over the kernel regime after being trained with a single gradient descent step.
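The one-step protocol can be sketched directly. In the following illustration, the first layer takes a single large gradient step, after which only the readout is refit; the single-index teacher, step-size scaling, and widths are assumptions, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(2)
d, p, n, lam = 150, 200, 600, 1e-2
eta = np.sqrt(p)     # a large, width-scaled step size (the regime where one step helps)

X = rng.standard_normal((n, d))
theta = rng.standard_normal(d) / np.sqrt(d)
y = np.tanh(X @ theta)                  # single-index target (illustrative)

W = rng.standard_normal((p, d)) / np.sqrt(d)
a = rng.standard_normal(p) / np.sqrt(p)

# One full-batch gradient step on the first layer of f(x) = a . tanh(W x).
H = np.tanh(X @ W.T)                    # (n, p) hidden activations
err = H @ a - y
gW = ((err[:, None] * (1 - H**2)) * a[None, :]).T @ X / n
W1 = W - eta * gW                       # updated first-layer weights

# Then fit only the readout on the updated features (ridge regression).
H1 = np.tanh(X @ W1.T)
a1 = np.linalg.solve(H1.T @ H1 + lam * np.eye(p), H1.T @ y)
```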

A phase transition between positional and semantic learning in a solvable model of dot-product attention

no code implementations • 6 Feb 2024 • Hugo Cui, Freya Behrens, Florent Krzakala, Lenka Zdeborová

Many empirical studies have provided evidence for the emergence of algorithmic mechanisms (abilities) in the learning of language models that lead to qualitative improvements in model capabilities.
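A minimal sketch of the two mechanisms the title refers to, using a single tied, low-rank dot-product attention layer. The toy `attention` function and all dimensions below are hypothetical choices for illustration, not the paper's model definition.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(3)
L, d, r = 8, 32, 4                            # sequence length, embedding dim, rank

X = rng.standard_normal((L, d))               # token embeddings
P = rng.standard_normal((L, d))               # positional encodings
Q = rng.standard_normal((d, r)) / np.sqrt(d)  # tied, low-rank query/key weights

def attention(X, P, Q, semantic=True):
    # Semantic mechanism: attention scores depend on token content X.
    # Positional mechanism: scores depend only on the positional encodings P.
    Z = X if semantic else P
    scores = (Z @ Q) @ (Z @ Q).T / np.sqrt(r)  # tied dot-product scores
    return softmax(scores, axis=-1) @ X

out_semantic = attention(X, P, Q, semantic=True)
out_positional = attention(X, P, Q, semantic=False)
```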

Analysis of learning a flow-based generative model from limited sample complexity

1 code implementation • 5 Oct 2023 • Hugo Cui, Florent Krzakala, Eric Vanden-Eijnden, Lenka Zdeborová

We study the problem of training a flow-based generative model, parametrized by a two-layer autoencoder, to sample from a high-dimensional Gaussian mixture.

Denoising • Form
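Once such a model is trained, sampling amounts to integrating the learned flow from Gaussian noise toward the data distribution. The sketch below shows the Euler sampler only; the `velocity` function is a hand-written stand-in for the trained two-layer autoencoder, not the paper's learned field, and the two-cluster target is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n_steps = 50, 200
mu = np.ones(d) / np.sqrt(d)    # mixture means at +/- mu (illustrative target)

def velocity(x, t):
    # Stand-in for the learned velocity field b(x, t); in the paper's setting this
    # would be the trained two-layer autoencoder, not a hand-written drift.
    m = np.tanh(x @ mu / max(t, 1e-2)) * mu   # soft assignment to the nearer mean
    return m - x                              # drift toward the selected cluster

# Sampling: integrate dx/dt = b(x, t) from t=0 (noise) to t=1 (data) with Euler steps.
x = rng.standard_normal(d)
dt = 1.0 / n_steps
for k in range(n_steps):
    x = x + dt * velocity(x, k * dt)
```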

Deterministic equivalent and error universality of deep random features learning

1 code implementation • 1 Feb 2023 • Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro

Establishing this result requires proving a deterministic equivalent for traces of the deep random features sample covariance matrices which can be of independent interest.

Bayes-optimal Learning of Deep Random Networks of Extensive-width

no code implementations • 1 Feb 2023 • Hugo Cui, Florent Krzakala, Lenka Zdeborová

We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights.

Form • regression
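The target class in this abstract is easy to instantiate. Below is a sketch of generating labels from a deep, extensive-width random Gaussian network; the widths and nonlinearity are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 100, 500
widths = [100, 100]   # "extensive width": hidden layers scale with the input dimension

# A fixed target network with random Gaussian weights.
Ws, prev = [], d
for w in widths:
    Ws.append(rng.standard_normal((w, prev)) / np.sqrt(prev))
    prev = w
v = rng.standard_normal(prev) / np.sqrt(prev)

def target(X):
    H = X
    for W in Ws:
        H = np.tanh(H @ W.T)
    return H @ v

X = rng.standard_normal((n, d))
y = target(X)   # the labels a learner observes; the paper characterizes the
                # Bayes-optimal error achievable for exactly this kind of target
```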

Error Scaling Laws for Kernel Classification under Source and Capacity Conditions

no code implementations • 29 Jan 2022 • Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

We find that our rates tightly describe the learning curves for this class of data sets, and are also observed on real data.

Classification • Prediction
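The source and capacity conditions invoked in the title are standard regularity assumptions from the kernel literature. One common formulation (the notation below is an assumption, not taken from this listing) is

```latex
\text{(capacity)} \quad \lambda_k \asymp k^{-\alpha}, \; \alpha > 1,
\qquad
\text{(source)} \quad \sum_{k} \theta_k^2 \, \lambda_k^{-2r} < \infty, \; r > 0,
```

where $\lambda_k$ are the eigenvalues of the kernel and $\theta_k$ the coefficients of the target function in the kernel eigenbasis: $\alpha$ controls the effective dimension of the RKHS, and $r$ the smoothness of the target relative to it.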

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

no code implementations • NeurIPS 2021 • Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

In this work, we unify and extend this line of work, providing characterization of all regimes and excess error decay rates that can be observed in terms of the interplay of noise and regularization.

regression

Learning curves of generic features maps for realistic datasets with a teacher-student model

1 code implementation • NeurIPS 2021 • Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

While still solvable in closed form, this generalization is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework.
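A hedged sketch of the generalized teacher-student setup: the teacher and student act through different, generic feature maps (here random projections with a nonlinearity, one possible choice among many), and the student fits its readout by ridge regression.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n, lam = 100, 400, 1e-2
p_t, p_s = 120, 150   # teacher / student feature dimensions (illustrative)

# Generic feature maps: random projections with a nonlinearity (one possible choice).
Wt = rng.standard_normal((p_t, d)) / np.sqrt(d)   # teacher feature map
Ws = rng.standard_normal((p_s, d)) / np.sqrt(d)   # student feature map
theta = rng.standard_normal(p_t) / np.sqrt(p_t)   # teacher weights

Z = rng.standard_normal((n, d))                   # latent Gaussian inputs
y = np.tanh(Z @ Wt.T) @ theta                     # labels from the teacher's features

# The student performs ridge regression on its own (different) features.
Phi = np.tanh(Z @ Ws.T)
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(p_s), Phi.T @ y)
train_err = np.mean((Phi @ w - y) ** 2)
```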

Large deviations for the perceptron model and consequences for active learning

no code implementations • 9 Dec 2019 • Hugo Cui, Luca Saglietti, Lenka Zdeborová

These large deviations then provide optimal achievable performance boundaries for any active learning algorithm.

Active Learning
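A minimal uncertainty-sampling sketch of perceptron active learning: at each round, query the label of the pool point with the smallest margin under the current estimate. This illustrates the algorithmic setting only, not the paper's large-deviation analysis; all sizes and the query rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
d, n_pool, n_queries = 50, 1000, 100

w_star = rng.standard_normal(d)          # unknown teacher perceptron
X = rng.standard_normal((n_pool, d))
y = np.sign(X @ w_star)                  # labels, revealed only upon query

w = np.zeros(d)
labeled = []
for t in range(n_queries):
    margins = np.abs(X @ w) if t else rng.random(n_pool)  # first pick at random
    margins[labeled] = np.inf            # never re-query a point
    i = int(np.argmin(margins))          # most uncertain point in the pool
    labeled.append(i)
    if np.sign(X[i] @ w) != y[i]:        # perceptron update on mistakes
        w += y[i] * X[i]

alignment = w @ w_star / (np.linalg.norm(w) * np.linalg.norm(w_star))
print(f"teacher-student alignment after {n_queries} queries: {alignment:.2f}")
```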
