5 Mar 2024 • Tomás González, Cristóbal Guzmán, Courtney Paquette
For convex-concave and first-order-smooth stochastic objectives, our algorithms attain a rate of $\sqrt{\log(d)/n} + (\log(d)^{3/2}/[n\varepsilon])^{1/3}$, where $d$ is the dimension of the problem and $n$ the dataset size.
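A quick numeric sketch of how this rate scales, with problem-dependent constants suppressed; the values of `d`, `n`, and `eps` below are illustrative, not from the paper:

```python
import numpy as np

def dp_saddle_rate(d, n, eps):
    """Evaluate sqrt(log(d)/n) + (log(d)^{3/2}/[n*eps])^{1/3},
    ignoring problem-dependent constants."""
    log_d = np.log(d)
    statistical = np.sqrt(log_d / n)                 # non-private statistical term
    privacy = (log_d ** 1.5 / (n * eps)) ** (1 / 3)  # price of privacy
    return statistical + privacy

# The log(d) dependence keeps the bound nearly dimension-independent:
for d in (10, 10**3, 10**6):
    print(d, dp_saddle_rate(d, n=10**5, eps=1.0))
```

For fixed `n` and `eps`, growing the dimension by orders of magnitude moves the bound only logarithmically.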
8 Feb 2024 • Pierre Marion, Anna Korba, Peter Bartlett, Mathieu Blondel, Valentin De Bortoli, Arnaud Doucet, Felipe Llinares-López, Courtney Paquette, Quentin Berthet
We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions.
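The paper's algorithm is not reproduced here; the following minimal sketch only illustrates a basic ingredient of this setting, differentiating through an Euler-Maruyama discretization of a parameterized diffusion along a fixed noise path (a pathwise/reparameterization gradient). The drift, loss, and all constants are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_with_pathwise_grad(theta, x0=1.0, T=1.0, n_steps=100, sigma=0.5):
    """Euler-Maruyama for dX = -theta*X dt + sigma dW, carrying the
    sensitivity dX/dtheta along the same noise path."""
    dt = T / n_steps
    x, dx_dtheta = x0, 0.0
    for _ in range(n_steps):
        z = rng.standard_normal()
        # differentiate the update X <- X - theta*X*dt + sigma*sqrt(dt)*Z in theta
        dx_dtheta = dx_dtheta - (x + theta * dx_dtheta) * dt
        x = x - theta * x * dt + sigma * np.sqrt(dt) * z
    return x, dx_dtheta

# Stochastic gradient descent on theta for the loss E[X_T^2]
theta = 0.1
for _ in range(200):
    x_T, dx_T = simulate_with_pathwise_grad(theta)
    theta -= 0.05 * (2.0 * x_T * dx_T)   # pathwise gradient of X_T^2
print(theta)                             # the drift strengthens to shrink X_T
```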
17 Aug 2023 • Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi
In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD) which allows us to analyze the dynamics of general statistics of SGD iterates.
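As a rough illustration of the surrogate-SDE idea (the diffusion coefficient below is a simplified stand-in, not the paper's homogenized coefficient), one can run one-pass SGD on a least squares problem next to an Euler-Maruyama discretization of an SDE with the matching drift:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, gamma = 1000, 250, 0.3
A = rng.standard_normal((n, d)) / np.sqrt(d)   # rows of roughly unit norm
b = rng.standard_normal(n)
loss = lambda x: 0.5 * np.mean((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b) / n

# Single-sample SGD, one pass over the data
x_sgd = np.zeros(d)
for _ in range(n):
    i = rng.integers(n)
    x_sgd -= gamma * (A[i] @ x_sgd - b[i]) * A[i]

# Euler-Maruyama for an SDE surrogate with drift -gamma*grad(X).
# The noise scale (tied to the current loss) is an illustrative stand-in;
# the paper derives the exact homogenized diffusion coefficient.
x_sde, dt = np.zeros(d), 1.0                   # one SDE time unit per SGD step
for _ in range(n):
    noise = rng.standard_normal(d)
    x_sde += (-gamma * grad(x_sde) * dt
              + gamma * np.sqrt(2.0 * loss(x_sde) * dt / n) * noise)

print(loss(x_sgd), loss(x_sde))                # comparable loss values
```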
20 Jun 2022 • Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette
The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than the usual worst-case results.
15 Jun 2022 • Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington
Stochastic gradient descent (SGD) is a pillar of modern machine learning, serving as the go-to optimization algorithm for a diverse array of problems.
2 Jun 2022 • Kiwon Lee, Andrew N. Cheng, Courtney Paquette, Elliot Paquette
We analyze the dynamics of large-batch stochastic gradient descent with momentum (SGD+M) on the least squares problem when the number of samples and the dimension are both large.
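A minimal sketch of this setting, assuming Gaussian data and heavy-ball momentum, with all constants illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, batch = 2000, 400, 200        # samples, dimension, batch: all large
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
loss = lambda x: 0.5 * np.mean((A @ x - b) ** 2)

x, v = np.zeros(d), np.zeros(d)
lr, beta = 0.2, 0.9                 # step size and momentum (heavy ball)
for _ in range(300):
    idx = rng.choice(n, size=batch, replace=False)
    g = A[idx].T @ (A[idx] @ x - b[idx]) / batch   # minibatch gradient
    v = beta * v - lr * g                          # momentum update
    x = x + v

x_star = np.linalg.lstsq(A, b, rcond=None)[0]
print(loss(x), loss(x_star))        # SGD+M approaches the least-squares optimum
```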
14 May 2022 • Courtney Paquette, Elliot Paquette, Ben Adlam, Jeffrey Pennington
By analyzing homogenized SGD, we provide exact non-asymptotic high-dimensional expressions for the generalization performance of SGD in terms of a solution of a Volterra integral equation.
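Equations of this type can be stepped forward in time. A generic numerical sketch for a second-kind Volterra equation $\psi(t) = g(t) + \int_0^t K(t,s)\,\psi(s)\,ds$, with a toy kernel and forcing chosen purely for illustration, not the paper's:

```python
import numpy as np

def solve_volterra(g, K, T=5.0, n=500):
    """Solve psi(t) = g(t) + int_0^t K(t,s) psi(s) ds (second kind)
    by stepping forward on a uniform grid with a left-endpoint rule."""
    t = np.linspace(0.0, T, n)
    h = t[1] - t[0]
    psi = np.empty(n)
    psi[0] = g(t[0])
    for i in range(1, n):
        conv = h * np.sum(K(t[i], t[:i]) * psi[:i])  # quadrature of the memory term
        psi[i] = g(t[i]) + conv
    return t, psi

# Toy exponential kernel and forcing:
t, psi = solve_volterra(g=lambda t: np.exp(-t),
                        K=lambda t, s: 0.5 * np.exp(-(t - s)))
print(psi[-1])
```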
NeurIPS 2021 • Courtney Paquette, Elliot Paquette
We analyze a class of stochastic gradient algorithms with momentum on a high-dimensional random least squares problem.
8 Feb 2021 • Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette
We propose a new framework, inspired by random matrix theory, for analyzing the dynamics of stochastic gradient descent (SGD) when both number of samples and dimensions are large.
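In this proportional regime ($d/n$ held fixed as both grow), random-matrix predictions such as the Marchenko-Pastur law describe the data spectrum. A quick empirical check under Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 4000, 1000                      # proportional regime: d/n fixed
A = rng.standard_normal((n, d)) / np.sqrt(n)
eigs = np.linalg.eigvalsh(A.T @ A)     # spectrum of the sample covariance

# The spectrum fills the Marchenko-Pastur bulk [(1-sqrt(r))^2, (1+sqrt(r))^2]
r = d / n
edges = (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2
print(eigs.min(), eigs.max())          # close to the predicted edges
print(edges)
```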
8 Jun 2020 • Courtney Paquette, Bart van Merriënboer, Elliot Paquette, Fabian Pedregosa
In fact, the halting time exhibits a universality property: it is independent of the probability distribution of the data.
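A small experiment in this spirit, using plain gradient descent on a random least squares problem (sizes, tolerance, and step size are illustrative): the iteration counts for Gaussian and Rademacher data land close together.

```python
import numpy as np

rng = np.random.default_rng(4)

def halting_time(sampler, n=800, d=400, tol=1e-6, lr=0.3, max_iter=10_000):
    """Gradient descent on f(x) = ||Ax - b||^2 / (2n); return the number
    of iterations until the gradient norm falls below tol."""
    A, b = sampler((n, d)), sampler((n,))
    x = np.zeros(d)
    for k in range(max_iter):
        g = A.T @ (A @ x - b) / n
        if np.linalg.norm(g) < tol:
            return k
        x -= lr * g
    return max_iter

gaussian = lambda shape: rng.standard_normal(shape)
rademacher = lambda shape: rng.choice([-1.0, 1.0], size=shape)
print(np.mean([halting_time(gaussian) for _ in range(5)]))
print(np.mean([halting_time(rademacher) for _ in range(5)]))  # nearly identical
```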
23 Mar 2020 • Sina Baghal, Courtney Paquette, Stephen A. Vavasis
We propose a new, simple, and computationally inexpensive termination test for constant step-size stochastic gradient descent (SGD) applied to binary classification with the logistic and hinge losses and homogeneous linear predictors.
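A sketch of the setting: constant step-size SGD on the logistic loss for separable data. The stopping check below (stop once the iterate separates the training set) is a simple illustrative stand-in; the paper defines its own inexpensive test.

```python
import numpy as np

rng = np.random.default_rng(5)

# Separable toy data: labels from a ground-truth linear predictor
n, d = 500, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true)

w, lr = np.zeros(d), 0.5              # constant step size
for step in range(1, 100_001):
    i = rng.integers(n)
    margin = y[i] * (X[i] @ w)
    # stochastic gradient step for the logistic loss at sample i
    w += lr * y[i] * X[i] / (1.0 + np.exp(margin))
    # Illustrative termination check: the iterate separates the data
    if step % 100 == 0 and np.all(y * (X @ w) > 0):
        print("terminated at step", step)
        break
```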
31 Mar 2017 • Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui
We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions.
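A toy sketch of the convexification idea behind such schemes: adding a large enough quadratic to the objective makes each subproblem convex, so a method designed for convex minimization can be applied repeatedly. The objective, kappa, and step counts below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def convexified_outer_loop(grad_f, x0, kappa=3.0, outer=50, inner=100, lr=0.05):
    """Proximal-point style scheme: each outer step approximately minimizes
    f(x) + (kappa/2)*||x - x_k||^2, which is convex once kappa exceeds the
    negative curvature of f, here using plain gradient descent."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer):
        center, z = x.copy(), x.copy()
        for _ in range(inner):
            z = z - lr * (grad_f(z) + kappa * (z - center))  # convex subproblem
        x = z
    return x

# Toy nonconvex objective f(x) = sum(x^4/4 - x^2): curvature >= -2, so kappa=3 works
grad_f = lambda x: x**3 - 2.0 * x
print(convexified_outer_loop(grad_f, x0=[3.0, -2.0]))  # converges near +-sqrt(2)
```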