Search Results for author: Fabian Pedregosa

Found 40 papers, 13 papers with code

Acceleration through spectral density estimation

no code implementations ICML 2020 Fabian Pedregosa, Damien Scieur

We develop a framework for designing optimal optimization methods in terms of their average-case runtime.

Density Estimation, regression

Universal Asymptotic Optimality of Polyak Momentum

no code implementations ICML 2020 Damien Scieur, Fabian Pedregosa

We consider the average-case runtime analysis of algorithms for minimizing quadratic objectives.
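No code accompanies this entry; as a point of reference, the following minimal NumPy sketch runs the classical Polyak (heavy-ball) momentum iteration on a random quadratic, the kind of objective the paper's average-case analysis studies. The dimension, step size, and momentum value are illustrative choices, not taken from the paper.

```python
import numpy as np

# Heavy-ball (Polyak momentum) iteration on a random quadratic
#   f(x) = 0.5 * x^T A x - b^T x,  with A symmetric positive definite.
# Step size and momentum use the classical tuning for known extreme
# eigenvalues L (largest) and mu (smallest); all sizes are illustrative.
rng = np.random.default_rng(0)
d = 50
M = rng.standard_normal((d, d))
A = M @ M.T / d + 0.1 * np.eye(d)          # SPD Hessian
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)             # exact minimizer, for monitoring

eigs = np.linalg.eigvalsh(A)
mu, L = eigs[0], eigs[-1]
step = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
momentum = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x_prev = np.zeros(d)
x = np.zeros(d)
for t in range(200):
    grad = A @ x - b
    x, x_prev = x - step * grad + momentum * (x - x_prev), x
print("distance to minimizer:", np.linalg.norm(x - x_star))
```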

Stability-Aware Training of Neural Network Interatomic Potentials with Differentiable Boltzmann Estimators

1 code implementation 21 Feb 2024 Sanjeev Raja, Ishan Amin, Fabian Pedregosa, Aditi S. Krishnapriyan

As a general framework applicable across NNIP architectures and systems, StABlE Training is a powerful tool for training stable and accurate NNIPs, particularly in the absence of large reference datasets.

On the Interplay Between Stepsize Tuning and Progressive Sharpening

no code implementations 30 Nov 2023 Vincent Roulet, Atish Agarwala, Fabian Pedregosa

Recent empirical work has revealed an intriguing property of deep learning models by which the sharpness (largest eigenvalue of the Hessian) increases throughout optimization until it stabilizes around a critical value at which the optimizer operates at the edge of stability, given a fixed stepsize (Cohen et al., 2022).

A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces

no code implementations 8 Dec 2022 Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare

In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns.

Image Compression, reinforcement-learning +1

When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

no code implementations 9 Nov 2022 Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa

The extragradient method has gained popularity due to its robust convergence properties for differentiable games.
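For orientation, here is a minimal NumPy sketch of the plain extragradient iteration on a toy bilinear min-max problem; it does not reproduce the paper's momentum variant or its polynomial-based tuning, and the matrix, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

# Extragradient on a toy bilinear min-max problem
#   min_x max_y  x^T B y
# whose vector field V(x, y) = (B y, -B^T x) is purely rotational, so plain
# gradient descent-ascent diverges while extragradient converges.
# B is chosen orthogonal so the game is well conditioned; all values are
# illustrative choices.
rng = np.random.default_rng(0)
d = 10
B, _ = np.linalg.qr(rng.standard_normal((d, d)))

def field(x, y):
    return B @ y, -B.T @ x          # (grad wrt x, -grad wrt y) of x^T B y

step = 0.3
x, y = rng.standard_normal(d), rng.standard_normal(d)
for t in range(500):
    gx, gy = field(x, y)
    x_half, y_half = x - step * gx, y - step * gy      # extrapolation step
    gx, gy = field(x_half, y_half)
    x, y = x - step * gx, y - step * gy                 # update step
print("distance to the (0, 0) equilibrium:",
      np.hypot(np.linalg.norm(x), np.linalg.norm(y)))
```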

Second-order regression models exhibit progressive sharpening to the edge of stability

no code implementations 10 Oct 2022 Atish Agarwala, Fabian Pedregosa, Jeffrey Pennington

Recent studies of gradient descent with large step sizes have shown that there is often a regime with an initial increase in the largest eigenvalue of the loss Hessian (progressive sharpening), followed by a stabilization of the eigenvalue near the maximum value which allows convergence (edge of stability).

regression

The Curse of Unrolling: Rate of Differentiating Through Optimization

no code implementations 27 Sep 2022 Damien Scieur, Quentin Bertrand, Gauthier Gidel, Fabian Pedregosa

Computing the Jacobian of the solution of an optimization problem is a central problem in machine learning, with applications in hyperparameter optimization, meta-learning, optimization as a layer, and dataset distillation, to name a few.

Hyperparameter Optimization, Meta-Learning +1

Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime

no code implementations 20 Jun 2022 Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette

The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results.

Cutting Some Slack for SGD with Adaptive Polyak Stepsizes

no code implementations 24 Feb 2022 Robert M. Gower, Mathieu Blondel, Nidham Gazagnadou, Fabian Pedregosa

We use this insight to develop new variants of the SPS method that are better suited to nonlinear models.
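The paper's variants are not reproduced here; the sketch below only illustrates the basic stochastic Polyak stepsize (SPS) they build on, applied to an interpolating least-squares problem where each per-sample optimal value is zero. Problem sizes and the stepsize cap are illustrative assumptions.

```python
import numpy as np

# Basic stochastic Polyak stepsize (SPS) on an interpolating least-squares
# problem: f_i(w) = 0.5 * (a_i @ w - b_i)^2 with b = A @ w_true, so each
# per-sample optimum f_i^* is 0. The cap gamma_max and the data sizes are
# illustrative; the paper's slack-based variants are not reproduced.
rng = np.random.default_rng(0)
n, d = 200, 20
A = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
b = A @ w_true                              # interpolation: zero loss attainable

w = np.zeros(d)
gamma_max = 10.0
for t in range(5000):
    i = rng.integers(n)
    residual = A[i] @ w - b[i]
    loss_i = 0.5 * residual ** 2            # f_i(w), with f_i^* = 0
    grad_i = residual * A[i]
    grad_sq = grad_i @ grad_i
    if grad_sq > 0:
        step = min(loss_i / grad_sq, gamma_max)   # Polyak stepsize
        w = w - step * grad_i
print("parameter error:", np.linalg.norm(w - w_true))
```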

GradMax: Growing Neural Networks using Gradient Information

1 code implementation ICLR 2022 Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa

The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified.

Efficient and Modular Implicit Differentiation

1 code implementation NeurIPS 2021 Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert

In this paper, we propose automatic implicit differentiation, an efficient and modular approach for implicit differentiation of optimization problems.

Meta-Learning
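The accompanying library automates this; purely for illustration, the following hand-rolled NumPy sketch applies the underlying implicit function theorem to ridge regression and checks the resulting Jacobian of the solution with respect to the regularization strength against finite differences. All sizes and values are illustrative assumptions.

```python
import numpy as np

# Implicit differentiation of an argmin, illustrated by hand on ridge
# regression: w*(lam) = argmin_w 0.5*||X w - y||^2 + 0.5*lam*||w||^2.
# The optimality condition F(w, lam) = (X^T X + lam I) w - X^T y = 0 gives,
# by the implicit function theorem,
#   dw*/dlam = -(dF/dw)^{-1} (dF/dlam) = -(X^T X + lam I)^{-1} w*.
rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
lam = 0.5

def solve_ridge(lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_star = solve_ridge(lam)
# Implicit-function-theorem Jacobian of the solution w.r.t. lam:
dw_dlam = -np.linalg.solve(X.T @ X + lam * np.eye(d), w_star)

# Finite-difference check of the same quantity.
eps = 1e-6
fd = (solve_ridge(lam + eps) - solve_ridge(lam - eps)) / (2 * eps)
print("max abs difference vs finite differences:", np.max(np.abs(dw_dlam - fd)))
```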

Boosting Variational Inference With Locally Adaptive Step-Sizes

no code implementations 19 May 2021 Gideon Dresdner, Saurav Shekhar, Fabian Pedregosa, Francesco Locatello, Gunnar Rätsch

Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution.

Variational Inference

Bridging the Gap Between Adversarial Robustness and Optimization Bias

1 code implementation 17 Feb 2021 Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux

We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.

Adversarial Robustness

SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality

no code implementations 8 Feb 2021 Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette

We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both number of samples and dimensions are large.

Average-case Acceleration for Bilinear Games and Normal Matrices

no code implementations ICLR 2021 Carles Domingo-Enrich, Fabian Pedregosa, Damien Scieur

First, we show that for zero-sum bilinear games the average-case optimal method is the optimal method for the minimization of the Hamiltonian.

Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis

no code implementations 8 Jun 2020 Courtney Paquette, Bart van Merriënboer, Elliot Paquette, Fabian Pedregosa

In fact, the halting time exhibits a universality property: it is independent of the probability distribution.

The Geometry of Sign Gradient Descent

no code implementations ICLR 2020 Lukas Balles, Fabian Pedregosa, Nicolas Le Roux

Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training.

Distributed Optimization
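As a reference point for the method being analyzed, here is a minimal NumPy sketch of plain sign gradient descent on an ill-conditioned quadratic; the step-size schedule and test problem are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Plain sign gradient descent on an ill-conditioned quadratic
#   f(x) = 0.5 * x^T diag(h) x.
# Each update moves every coordinate by the same magnitude, using only the
# sign of the gradient; the decaying step size is an illustrative choice.
rng = np.random.default_rng(0)
d = 100
h = np.logspace(0, 3, d)                    # curvatures from 1 to 1000
x = rng.standard_normal(d)

for t in range(1, 501):
    grad = h * x
    step = 1.0 / t                          # simple 1/t decay
    x = x - step * np.sign(grad)
print("final objective:", 0.5 * np.sum(h * x * x))
```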

Average-case Acceleration Through Spectral Density Estimation

no code implementations 12 Feb 2020 Fabian Pedregosa, Damien Scieur

We develop a framework for the average-case analysis of random quadratic problems and derive algorithms that are optimal under this analysis.

Density Estimation, regression

A Test for Shared Patterns in Cross-modal Brain Activation Analysis

1 code implementation 8 Oct 2019 Elena Kalinina, Fabian Pedregosa, Vittorio Iacovella, Emanuele Olivetti, Paolo Avesani

In the last decade, the identification of shared activity patterns has been mostly framed as a supervised learning problem.

Two-sample testing

The Difficulty of Training Sparse Neural Networks

no code implementations ICML 2019 Deep Phenomena Workshop Utku Evci, Fabian Pedregosa, Aidan Gomez, Erich Elsen

Additionally, our attempts to find a decreasing objective path from "bad" solutions to the "good" ones in the sparse subspace fail.

On the interplay between noise and curvature and its effect on optimization and generalization

no code implementations 18 Jun 2019 Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux

The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.

Variance Reduced Three Operator Splitting

1 code implementation 19 Jun 2018 Fabian Pedregosa, Kilian Fatras, Mattia Casotto

This is because existing methods require evaluating the proximity operator of the nonsmooth terms, which can be a costly operation for complex penalties.

Optimization and Control (65K10)

Frank-Wolfe Splitting via Augmented Lagrangian Method

no code implementations 9 Apr 2018 Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien

In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints.

Adaptive Three Operator Splitting

no code implementations ICML 2018 Fabian Pedregosa, Gauthier Gidel

We propose and analyze an adaptive step-size variant of the Davis-Yin three operator splitting.

Frank-Wolfe with Subsampling Oracle

no code implementations ICML 2018 Thomas Kerdreux, Fabian Pedregosa, Alexandre d'Aspremont

The first algorithm that we propose is a randomized variant of the original FW algorithm and achieves a $\mathcal{O}(1/t)$ sublinear convergence rate as in the deterministic counterpart.
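For context, the sketch below is the classic deterministic Frank-Wolfe method (full linear minimization oracle, open-loop 2/(t+2) step size) for least squares over an l1 ball; the paper's randomized, subsampled oracle is not reproduced, and the problem sizes and radius are illustrative assumptions.

```python
import numpy as np

# Classic (deterministic) Frank-Wolfe for least squares over an l1 ball:
#   min_{||x||_1 <= r}  0.5 * ||A x - b||^2.
# The linear minimization oracle over the l1 ball returns a signed vertex
# -r * sign(g_i) * e_i at the coordinate of largest |gradient|.
rng = np.random.default_rng(0)
n, d, r = 80, 200, 5.0
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x = np.zeros(d)
for t in range(500):
    grad = A.T @ (A @ x - b)
    i = np.argmax(np.abs(grad))
    s = np.zeros(d)
    s[i] = -r * np.sign(grad[i])            # LMO output: a vertex of the l1 ball
    gamma = 2.0 / (t + 2.0)                 # standard open-loop step size
    x = (1 - gamma) * x + gamma * s
print("objective:", 0.5 * np.linalg.norm(A @ x - b) ** 2)
```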

Improved asynchronous parallel optimization analysis for stochastic incremental methods

no code implementations 11 Jan 2018 Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

Notably, we prove that ASAGA and KROMAGNON can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions.

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

1 code implementation NeurIPS 2017 Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien

Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures.

On the convergence rate of the three operator splitting scheme

no code implementations 25 Oct 2016 Fabian Pedregosa

The three operator splitting scheme was recently proposed by [Davis and Yin, 2015] as a method to optimize composite objective functions with one convex smooth term and two convex (possibly non-smooth) terms whose proximity operators are accessible.
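Below is a minimal NumPy sketch of the Davis-Yin iteration analyzed in this paper, applied to a toy composite problem with one smooth least-squares term and two nonsmooth terms (an l1 penalty and a nonnegativity constraint), each handled only through its proximity operator. The step size and problem sizes are illustrative assumptions.

```python
import numpy as np

# Davis-Yin three operator splitting for a toy composite problem
#   min_x  0.5*||A x - b||^2  +  lam*||x||_1  +  indicator(x >= 0),
# i.e. one smooth term plus two nonsmooth terms accessed through their
# proximity operators (soft-thresholding and projection onto the
# nonnegative orthant).
rng = np.random.default_rng(0)
n, d, lam = 60, 120, 0.5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_f(x):                               # gradient of the smooth term
    return A.T @ (A @ x - b)

def prox_l1(x, t):                           # prox of t*lam*||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)

def prox_nonneg(x, t):                       # prox of the indicator of x >= 0
    return np.maximum(x, 0.0)

L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of grad_f
gamma = 1.0 / L
z = np.zeros(d)
for k in range(1000):
    x = prox_l1(z, gamma)                    # first prox step
    y = prox_nonneg(2 * x - z - gamma * grad_f(x), gamma)   # second prox step
    z = z + y - x                            # update the driving sequence
print("objective at the feasible iterate:",
      0.5 * np.linalg.norm(A @ y - b) ** 2 + lam * np.abs(y).sum())
```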

ASAGA: Asynchronous Parallel SAGA

1 code implementation 15 Jun 2016 Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates.
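For reference, here is sequential SAGA in NumPy on a small l2-regularized logistic regression problem; the asynchronous, parallel aspects that define ASAGA are not reproduced, and the sizes, step size, and regularization are illustrative assumptions.

```python
import numpy as np

# Sequential SAGA on l2-regularized logistic regression. Each step combines
# the fresh gradient of one sample with the stored table of past per-sample
# gradients, keeping the running average exact.
rng = np.random.default_rng(0)
n, d, reg = 200, 20, 0.1
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))

def grad_i(w, i):                            # gradient of one sample's loss
    s = 1.0 / (1.0 + np.exp(-y[i] * (X[i] @ w)))
    return -(1.0 - s) * y[i] * X[i] + reg * w

w = np.zeros(d)
table = np.array([grad_i(w, i) for i in range(n)])   # stored gradients
avg = table.mean(axis=0)
step = 0.05
for t in range(20 * n):
    i = rng.integers(n)
    g_new = grad_i(w, i)
    w = w - step * (g_new - table[i] + avg)  # SAGA update
    avg = avg + (g_new - table[i]) / n       # keep the running average exact
    table[i] = g_new

full_grad = np.mean([grad_i(w, i) for i in range(n)], axis=0)
print("norm of the full gradient:", np.linalg.norm(full_grad))
```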

Hyperparameter optimization with approximate gradient

1 code implementation 7 Feb 2016 Fabian Pedregosa

Most models in machine learning contain at least one hyperparameter to control for model complexity.

Hyperparameter Optimization, regression

On the Consistency of Ordinal Regression Methods

no code implementations 11 Aug 2014 Fabian Pedregosa, Francis Bach, Alexandre Gramfort

We will see that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification.

Binary Classification, General Classification +1

Data-driven HRF estimation for encoding and decoding models

no code implementations 27 Feb 2014 Fabian Pedregosa, Michael Eickenberg, Philippe Ciuciu, Bertrand Thirion, Alexandre Gramfort

We develop a method for the joint estimation of activation and HRF using a rank constraint causing the estimated HRF to be equal across events/conditions, yet permitting it to be different across voxels.

Computational Efficiency

Second order scattering descriptors predict fMRI activity due to visual textures

no code implementations 10 Aug 2013 Michael Eickenberg, Fabian Pedregosa, Senoussi Mehdi, Alexandre Gramfort, Bertrand Thirion

Second layer scattering descriptors are known to provide good classification performance on natural quasi-stationary processes such as visual textures due to their sensitivity to higher order moments and continuity with respect to small deformations.

General Classification

HRF estimation improves sensitivity of fMRI encoding and decoding models

no code implementations 13 May 2013 Fabian Pedregosa, Michael Eickenberg, Bertrand Thirion, Alexandre Gramfort

Extracting activation patterns from functional Magnetic Resonance Images (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of blood oxygen level-dependent (BOLD) signal.
