Search Results for author: Alberto Bietti

Found 31 papers, 17 papers with code

Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations

1 code implementation • 9 Jun 2017 • Alberto Bietti, Julien Mairal

The success of deep convolutional architectures is often attributed in part to their ability to learn multiscale and invariant representations of natural signals.

Generalization Bounds

Invariance and Stability of Deep Convolutional Representations

no code implementations • NeurIPS 2017 • Alberto Bietti, Julien Mairal

In this paper, we study deep signal representations that are near-invariant to groups of transformations and stable to the action of diffeomorphisms without losing signal information.

A Contextual Bandit Bake-off

1 code implementation • 12 Feb 2018 • Alberto Bietti, Alekh Agarwal, John Langford

Contextual bandit algorithms are essential for solving many real-world interactive machine learning problems.

A Kernel Perspective for Regularizing Deep Neural Networks

1 code implementation • 30 Sep 2018 • Alberto Bietti, Grégoire Mialon, Dexiong Chen, Julien Mairal

We propose a new point of view for regularizing deep neural networks by using the norm of a reproducing kernel Hilbert space (RKHS).

On the Inductive Bias of Neural Tangent Kernels

1 code implementation • NeurIPS 2019 • Alberto Bietti, Julien Mairal

State-of-the-art neural networks are heavily over-parameterized, making the optimization algorithm a crucial ingredient for learning predictive models with good generalization properties.

Inductive Bias

Convergence and Stability of Graph Convolutional Networks on Large Random Graphs

1 code implementation • NeurIPS 2020 • Nicolas Keriven, Alberto Bietti, Samuel Vaiter

We study properties of Graph Convolutional Networks (GCNs) by analyzing their behavior on standard models of random graphs, where nodes are represented by random latent variables and edges are drawn according to a similarity kernel.

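The random-graph model described above can be sampled in a few lines. This is a minimal sketch, assuming a Gaussian similarity kernel and uniform scalar latent variables; both choices, and the function names, are illustrative rather than the paper's setup.

```python
import numpy as np

def sample_latent_graph(n, kernel, rng):
    """Sample a random graph: each node gets a latent variable, and each
    edge is drawn independently with probability given by a similarity
    kernel applied to the endpoints' latent variables."""
    x = rng.uniform(-1.0, 1.0, size=n)           # latent node variables
    p = kernel(x[:, None], x[None, :])           # pairwise edge probabilities
    upper = rng.random((n, n)) < p               # Bernoulli draws
    adj = np.triu(upper, k=1)                    # keep one draw per pair, no self-loops
    adj = adj | adj.T                            # symmetrize
    return x, adj.astype(int)

# Illustrative kernel: nearby latent variables connect more often.
gaussian = lambda a, b: np.exp(-4.0 * (a - b) ** 2)

rng = np.random.default_rng(0)
x, adj = sample_latent_graph(200, gaussian, rng)
```

A GCN's behavior on such graphs can then be compared across draws of increasing size `n`, which is the regime the paper analyzes.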

Deep Equals Shallow for ReLU Networks in Kernel Regimes

1 code implementation • ICLR 2021 • Alberto Bietti, Francis Bach

Deep networks are often considered to be more expressive than shallow ones in terms of approximation.

Approximation and Learning with Deep Convolutional Models: a Kernel Perspective

1 code implementation • ICLR 2022 • Alberto Bietti

The empirical success of deep convolutional networks on tasks involving high-dimensional data such as images or audio suggests that they can efficiently approximate certain functions that are well-suited for such tasks.

Additive Models • Generalization Bounds • +1

On Energy-Based Models with Overparametrized Shallow Neural Networks

1 code implementation • 15 Apr 2021 • Carles Domingo-Enrich, Alberto Bietti, Eric Vanden-Eijnden, Joan Bruna

Energy-based models (EBMs) are a simple yet powerful framework for generative modeling.

On the Universality of Graph Neural Networks on Large Random Graphs

1 code implementation • NeurIPS 2021 • Nicolas Keriven, Alberto Bietti, Samuel Vaiter

In the large graph limit, GNNs are known to converge to certain "continuous" models known as c-GNNs, which directly enables a study of their approximation power on random graph models.

Stochastic Block Model

On the Sample Complexity of Learning under Invariance and Geometric Stability

no code implementations • 14 Jun 2021 • Alberto Bietti, Luca Venturi, Joan Bruna

Many supervised learning problems involve high-dimensional data such as images, text, or graphs.


Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

no code implementations • 11 Jul 2021 • Carles Domingo-Enrich, Alberto Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden

In the feature-learning regime, this dual formulation justifies using a two time-scale gradient ascent-descent (GDA) training algorithm in which one updates concurrently the particles in the sample space and the neurons in the parameter space of the energy.
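The two time-scale gradient ascent-descent idea can be illustrated on a toy saddle problem. This is a generic sketch, not the paper's energy-based objective: the function `f`, the step sizes, and the variable names are all illustrative choices.

```python
import numpy as np

def two_timescale_gda(grad_u, grad_v, u, v, eta_u=0.05, eta_v=0.005, steps=2000):
    """Two time-scale GDA for max_u min_v f(u, v): a fast ascent step on u
    and a slower descent step on v, updated concurrently."""
    for _ in range(steps):
        u = u + eta_u * grad_u(u, v)   # ascent on the fast variable
        v = v - eta_v * grad_v(u, v)   # descent on the slow variable
    return u, v

# Toy saddle: concave in u, convex in v, with saddle point at (0, 0).
f = lambda u, v: -0.5 * u**2 + u * v + 0.5 * v**2
gu = lambda u, v: -u + v   # d f / d u
gv = lambda u, v: u + v    # d f / d v

u, v = two_timescale_gda(gu, gv, 1.0, 1.0)
```

In the paper's setting the "fast" variable plays the role of the particles in sample space and the "slow" variable the neurons of the energy, but this sketch only shows the generic two time-scale mechanism.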

On the Sample Complexity of Learning under Geometric Stability

no code implementations • NeurIPS 2021 • Alberto Bietti, Luca Venturi, Joan Bruna

Many supervised learning problems involve high-dimensional data such as images, text, or graphs.


Efficient Kernel UCB for Contextual Bandits

1 code implementation • 11 Feb 2022 • Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard

While standard methods require O(CT^3) complexity, where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems.

Computational Efficiency • Multi-Armed Bandits
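For reference, the textbook kernel-UCB rule that the O(CT^3) cost refers to scores each candidate action by a kernel-ridge posterior mean plus a variance-based exploration bonus. The sketch below is that generic rule in numpy, not the paper's efficient variant; the RBF kernel, `lam`, and `beta` are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between row sets A (m, d) and B (t, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_ucb_scores(X_hist, y_hist, X_cand, lam=1.0, beta=1.0):
    """Score = kernel-ridge posterior mean + beta * posterior std.
    Inverting the t x t kernel matrix at every round is the cubic
    bottleneck the paper works around."""
    K = rbf(X_hist, X_hist)
    Kinv = np.linalg.inv(K + lam * np.eye(len(X_hist)))
    k = rbf(X_cand, X_hist)                               # (m, t) cross-kernel
    mean = k @ Kinv @ y_hist
    var = 1.0 - np.einsum('ij,jk,ik->i', k, Kinv, k)      # prior k(x, x) = 1
    return mean + beta * np.sqrt(np.maximum(var, 0.0))

# Tiny usage example: two observed contexts, three candidates.
X_hist = np.array([[0.0], [1.0]])
y_hist = np.array([0.0, 1.0])
X_cand = np.array([[0.0], [1.0], [5.0]])
scores = kernel_ucb_scores(X_hist, y_hist, X_cand)
```

The far-away candidate gets a large variance bonus, which is how the rule drives exploration.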

On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes

no code implementations • 22 Mar 2022 • Elvis Dohmatob, Alberto Bietti

To better understand these factors, we provide a precise study of the adversarial robustness in different scenarios, from initialization to the end of training in different regimes, as well as intermediate scenarios, where initialization still plays a role due to "lazy" training.

Adversarial Robustness

When does return-conditioned supervised learning work for offline reinforcement learning?

1 code implementation • 2 Jun 2022 • David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna

Several recent works have proposed a class of algorithms for the offline reinforcement learning (RL) problem that we will refer to as return-conditioned supervised learning (RCSL).

D4RL • Reinforcement Learning • +1
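The core RCSL idea is to treat offline RL as supervised learning of an action given a state and a target return, then act by conditioning on a high return. A toy sketch, with a hypothetical 1-nearest-neighbor stand-in for the learned policy and an invented one-step environment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy offline dataset: in state s, action 1 tends to yield return s + 1,
# action 0 yields return s (plus small noise).
states = rng.uniform(0, 1, size=500)
actions = rng.integers(0, 2, size=500)
returns = states + actions + 0.05 * rng.normal(size=500)

def rcsl_policy(s, target_return):
    """Return-conditioned 'policy': copy the action of the dataset point
    whose (state, return) pair is closest to (s, target_return).
    A real RCSL method would fit a model instead of nearest neighbor."""
    d = (states - s) ** 2 + (returns - target_return) ** 2
    return int(actions[np.argmin(d)])
```

Conditioning the same policy on a high versus a low target return selects different actions, which is exactly the mechanism whose validity the paper studies.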

Learning Single-Index Models with Shallow Neural Networks

no code implementations • 27 Oct 2022 • Alberto Bietti, Joan Bruna, Clayton Sanford, Min Jae Song

Single-index models are a class of functions given by an unknown univariate "link" function applied to an unknown one-dimensional projection of the input.
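The definition above translates directly into a data-generating sketch: the label depends on the high-dimensional input only through one projection. The tanh link and Gaussian direction below are illustrative placeholders for the unknown link and projection.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20
w = rng.normal(size=d)
w /= np.linalg.norm(w)      # unknown one-dimensional projection direction
link = np.tanh              # unknown univariate "link" function

X = rng.normal(size=(1000, d))
y = link(X @ w)             # single-index model: y = g(w . x)

# Moving orthogonally to w leaves the label unchanged.
x1 = X[0]
u = rng.normal(size=d)
u -= (u @ w) * w            # project out the w-component
x2 = x1 + u                 # same projection onto w as x1
```

This invariance in all directions orthogonal to `w` is what makes the learning problem effectively one-dimensional once the direction is recovered.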

On minimal variations for unsupervised representation learning

no code implementations • 7 Nov 2022 • Vivien Cabannes, Alberto Bietti, Randall Balestriero

Unsupervised representation learning aims at describing raw data efficiently to solve various downstream tasks.

Representation Learning • Self-Supervised Learning

The SSL Interplay: Augmentations, Inductive Bias, and Generalization

no code implementations • 6 Feb 2023 • Vivien Cabannes, Bobak T. Kiani, Randall Balestriero, Yann LeCun, Alberto Bietti

Self-supervised learning (SSL) has emerged as a powerful framework to learn representations from raw data without supervision.

Data Augmentation • Inductive Bias • +1

Scaling Laws for Associative Memories

no code implementations • 4 Oct 2023 • Vivien Cabannes, Elvis Dohmatob, Alberto Bietti

Learning arguably involves the discovery and memorization of abstract rules.

Memorization

On Learning Gaussian Multi-index Models with Gradient Flow

no code implementations • 30 Oct 2023 • Alberto Bietti, Joan Bruna, Loucas Pillaud-Vivien

We study gradient flow on the multi-index regression problem for high-dimensional Gaussian data.

Learning Associative Memories with Gradient Descent

no code implementations • 28 Feb 2024 • Vivien Cabannes, Berfin Simsek, Alberto Bietti

This work focuses on the training dynamics of one associative memory module storing outer products of token embeddings.

Memorization
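An associative-memory module of the kind described above stores input-output pairs as a sum of outer products of token embeddings. A minimal sketch, using orthonormal (one-hot) input embeddings for clarity; that choice, and the decoding-by-argmax step, are simplifying assumptions rather than the paper's training setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_pairs = 128, 5
E_in = np.eye(d)[:n_pairs]                            # input embeddings (orthonormal)
E_out = rng.normal(size=(n_pairs, d)) / np.sqrt(d)    # output token embeddings

# One associative-memory matrix: the sum of outer products of the
# output and input embeddings of each stored pair.
W = sum(np.outer(E_out[i], E_in[i]) for i in range(n_pairs))

def recall(x):
    """Retrieve a stored pair: map x through W, then decode by picking
    the output embedding with the largest inner product."""
    return int(np.argmax(E_out @ (W @ x)))
```

With orthogonal inputs, `W @ E_in[i]` recovers `E_out[i]` exactly; the paper studies how gradient descent builds such matrices when embeddings are only approximately orthogonal.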

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models

no code implementations • 29 Feb 2024 • Frederik Kunstner, Robin Yadav, Alan Milligan, Mark Schmidt, Alberto Bietti

We show that the heavy-tailed class imbalance found in language modeling tasks leads to difficulties in the optimization dynamics.

Language Modelling

Level Set Teleportation: An Optimization Perspective

no code implementations • 5 Mar 2024 • Aaron Mishkin, Alberto Bietti, Robert M. Gower

We study level set teleportation, an optimization sub-routine which seeks to accelerate gradient methods by maximizing the gradient norm on a level-set of the objective function.

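The sub-routine described above, maximizing the gradient norm over a level set, can be sketched numerically on a quadratic. This is an illustrative construction, not the paper's algorithm: because this `f` is 2-homogeneous, simply rescaling the iterate retracts it back onto the level set after each ascent step.

```python
import numpy as np

A = np.diag([1.0, 10.0])                 # ill-conditioned quadratic objective
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

def teleport(x, steps=200, eta=0.01):
    """Ascend the squared gradient norm ||A x||^2 while staying on the
    level set {f = f(x0)}: take an ascent step, then rescale back onto
    the level set (valid here because f(c x) = c^2 f(x))."""
    c = f(x)
    for _ in range(steps):
        x = x + eta * 2.0 * (A @ grad(x))   # gradient of ||A x||^2 is 2 A^2 x
        x = x * np.sqrt(c / f(x))           # retract onto the level set
    return x

x0 = np.array([3.0, 0.1])
x1 = teleport(x0)
```

The teleported point has the same objective value but a much larger gradient, which is what makes the subsequent gradient step more productive.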
