1 code implementation • 9 Jan 2024 • Adityanarayanan Radhakrishnan, Mikhail Belkin, Dmitriy Drusvyatskiy
A possible explanation is that common training algorithms for neural networks implicitly perform dimensionality reduction, a process called feature learning.
no code implementations • 16 Jan 2023 • Damek Davis, Dmitriy Drusvyatskiy, Liwei Jiang
In their seminal work, Polyak and Juditsky showed that stochastic approximation algorithms for solving smooth equations enjoy a central limit theorem.
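The Polyak–Juditsky averaging scheme mentioned above can be sketched in one dimension. This is a toy illustration, not the paper's setting: the equation g(x) = x - 1 = 0, the noise level, and the step-size exponent are all assumptions chosen for the demo.

```python
import random

def polyak_juditsky(n_steps=10000, noise=0.5, seed=0):
    """Stochastic approximation for the root of g(x) = x - 1, observed
    with additive Gaussian noise, with Polyak-Juditsky iterate averaging."""
    rng = random.Random(seed)
    x = 0.0
    avg = 0.0
    for k in range(1, n_steps + 1):
        g = (x - 1.0) + noise * rng.gauss(0.0, 1.0)  # noisy evaluation of g(x)
        x -= g / k**0.75       # slowly decaying step size, as the theory requires
        avg += (x - avg) / k   # running average of the iterates
    return x, avg

last, averaged = polyak_juditsky()
```

The central limit theorem in question concerns the averaged iterate: its fluctuations around the root are asymptotically normal with an optimal covariance, even though the raw iterate uses a suboptimally large step size.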
1 code implementation • 9 Jul 2022 • Joshua Cutler, Mateo Díaz, Dmitriy Drusvyatskiy
We show that under mild assumptions, the deviation between the average iterate of the algorithm and the solution is asymptotically normal, with a covariance that clearly decouples the effects of the gradient noise and the distributional shift.
no code implementations • 8 Apr 2022 • Mitas Ray, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff
This paper studies the problem of expected loss minimization given a data distribution that is dependent on the decision-maker's action and evolves dynamically in time according to a geometric decay process.
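A minimal simulation of the setup described above, under assumed dynamics: the data distribution's mean drifts toward a point depending on the decision x via a geometric decay process, while x runs stochastic gradient descent on a squared loss. All constants and the specific decay model are hypothetical, for illustration only.

```python
import random

def simulate(rho=0.3, eps=0.5, steps=500, lr=0.1, seed=1):
    """Toy decision-dependent distribution: the data mean mu reacts to the
    decision x through a geometric decay process while x runs SGD."""
    rng = random.Random(seed)
    mu, x = 0.0, 0.0
    for _ in range(steps):
        mu = (1 - rho) * mu + rho * (1.0 + eps * x)  # distribution shifts toward x
        sample = mu + 0.1 * rng.gauss(0.0, 1.0)      # draw from current distribution
        x -= lr * (x - sample)                       # SGD step on (x - z)^2 / 2
    return x, mu

x_final, mu_final = simulate()
```

With these assumed dynamics the coupled system contracts to a joint equilibrium (here x = mu = 2), illustrating how retraining against a reacting distribution can still settle down.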
no code implementations • 7 Mar 2022 • Lijun Ding, Dmitriy Drusvyatskiy, Maryam Fazel, Zaid Harchaoui
Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance.
no code implementations • 10 Jan 2022 • Adhyyan Narang, Evan Faulkner, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff
We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and the repeated (stochastic) gradient method.
no code implementations • 26 Aug 2021 • Damek Davis, Dmitriy Drusvyatskiy, Liwei Jiang
We show that the subgradient method converges only to local minimizers when applied to generic Lipschitz continuous and subdifferentially regular functions that are definable in an o-minimal structure.
1 code implementation • NeurIPS 2021 • Joshua Cutler, Dmitriy Drusvyatskiy, Zaid Harchaoui
We consider the problem of minimizing a convex function that is evolving according to unknown and possibly stochastic dynamics, which may depend jointly on time and on the decision variable itself.
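A sketch of the tracking problem above, under a much simpler assumed model than the paper's: the minimizer of a quadratic f_t drifts deterministically in time, and SGD tracks it with a constant step size. The drift rate, step size, and noise level are illustrative choices.

```python
import random

def track(steps=2000, drift=0.01, lr=0.2, noise=0.1, seed=0):
    """SGD tracking the minimizer of f_t(x) = (x - m_t)^2 / 2
    while the minimizer m_t drifts over time."""
    rng = random.Random(seed)
    m, x = 0.0, 0.0
    errs = []
    for _ in range(steps):
        m += drift                                # the minimizer moves
        grad = (x - m) + noise * rng.gauss(0, 1)  # stochastic gradient of f_t at x
        x -= lr * grad
        errs.append(abs(x - m))
    return sum(errs[-500:]) / 500                 # average late tracking error

avg_err = track()
```

The point of such results is that the tracking error does not vanish but stabilizes at a level determined by the drift speed, the step size, and the gradient noise.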
no code implementations • 17 Jun 2021 • Damek Davis, Mateo Díaz, Dmitriy Drusvyatskiy
The main conclusion is that a variety of algorithms for nonsmooth optimization can escape strict saddle points of the Moreau envelope at a controlled rate.
no code implementations • NeurIPS 2021 • Joshua Cutler, Dmitriy Drusvyatskiy, Zaid Harchaoui
We consider the problem of minimizing a convex function that is evolving in time according to unknown and possibly stochastic dynamics.
no code implementations • 16 Dec 2019 • Damek Davis, Dmitriy Drusvyatskiy
We introduce a geometrically transparent strict saddle property for nonsmooth functions.
no code implementations • 31 Jul 2019 • Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang
Standard results in stochastic convex optimization bound the number of samples that an algorithm needs to generate a point with small function value in expectation.
1 code implementation • 22 Jul 2019 • Damek Davis, Dmitriy Drusvyatskiy, Vasileios Charisopoulos
In this work, we ask whether geometric step decay similarly improves stochastic algorithms for the class of sharp nonconvex problems.
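Geometric step decay, as referenced above, can be sketched on the simplest sharp function f(x) = |x|: the step size is held constant within an epoch and cut by a fixed factor between epochs. The specific schedule constants here are illustrative assumptions.

```python
def geometric_step_decay(x0=5.3, step0=1.0, epochs=10, inner=20, decay=0.5):
    """Subgradient method on the sharp function f(x) = |x|, with the step
    size held constant within each epoch and geometrically decayed between epochs."""
    x, step = x0, step0
    for _ in range(epochs):
        for _ in range(inner):
            g = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)  # subgradient of |x|
            x -= step * g
        step *= decay
    return abs(x)

final_gap = geometric_step_decay()
```

Because f grows linearly away from its minimizer, each epoch drives the iterate into a band whose width is the current step size, so halving the step per epoch yields linear convergence; the papers above study when this phenomenon survives stochastic noise and nonconvexity.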
no code implementations • 22 Apr 2019 • Vasileios Charisopoulos, Yudong Chen, Damek Davis, Mateo Díaz, Lijun Ding, Dmitriy Drusvyatskiy
The task of recovering a low-rank matrix from its noisy linear measurements plays a central role in computational science.
1 code implementation • 6 Jan 2019 • Vasileios Charisopoulos, Damek Davis, Mateo Díaz, Dmitriy Drusvyatskiy
The blind deconvolution problem seeks to recover a pair of vectors from a set of rank one bilinear measurements.
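The rank-one bilinear measurement model above can be written out directly: each measurement is b_i = ⟨a_i, x⟩⟨c_i, y⟩. The Gaussian sensing vectors below are an illustrative choice, not necessarily the paper's exact measurement ensemble.

```python
import random

def bilinear_measurements(x, y, m, seed=0):
    """Generate m rank-one bilinear measurements b_i = <a_i, x> * <c_i, y>
    of the pair (x, y), using random Gaussian sensing vectors a_i, c_i."""
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        a = [rng.gauss(0, 1) for _ in x]
        c = [rng.gauss(0, 1) for _ in y]
        b = sum(ai * xi for ai, xi in zip(a, x)) * sum(ci * yi for ci, yi in zip(c, y))
        data.append((a, c, b))
    return data

data = bilinear_measurements([1.0, 2.0], [3.0, -1.0], 5)
```

Note the inherent scaling ambiguity: the pair (t·x, y/t) produces exactly the same measurements for any t ≠ 0, so recovery is only possible up to this rescaling.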
no code implementations • 17 Oct 2018 • Damek Davis, Dmitriy Drusvyatskiy
We investigate the stochastic optimization problem of minimizing population risk, where the loss defining the risk is assumed to be weakly convex.
no code implementations • 1 Jul 2018 • Damek Davis, Dmitriy Drusvyatskiy, Kellie J. MacPhee
Given a nonsmooth, nonconvex minimization problem, we consider algorithms that iteratively sample and minimize stochastic convex models of the objective function.
1 code implementation • 20 Apr 2018 • Damek Davis, Dmitriy Drusvyatskiy, Sham Kakade, Jason D. Lee
This work considers the question: what convergence guarantees does the stochastic subgradient method have in the absence of smoothness and convexity?
no code implementations • 17 Mar 2018 • Damek Davis, Dmitriy Drusvyatskiy
We consider a family of algorithms that successively sample and minimize simple stochastic models of the objective function.
2 code implementations • 8 Feb 2018 • Damek Davis, Dmitriy Drusvyatskiy
We prove that the proximal stochastic subgradient method, applied to a weakly convex problem, drives the gradient of the Moreau envelope to zero at the rate $O(k^{-1/4})$.
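The stochastic subgradient method on a weakly convex function can be sketched on the toy example f(x) = |x² - 1|, which is weakly convex but neither smooth nor convex. The initialization, step-size constant, and noise level are illustrative assumptions; this demo shows the iterates settling near a minimizer (x = ±1) rather than verifying the O(k^{-1/4}) Moreau-envelope rate itself.

```python
import random

def stochastic_subgradient(x0=3.0, steps=4000, noise=0.1, seed=0):
    """Stochastic subgradient method on the weakly convex function
    f(x) = |x^2 - 1|, with step size proportional to 1/sqrt(k)."""
    rng = random.Random(seed)
    x = x0
    for k in range(1, steps + 1):
        g = 2.0 * x * (1.0 if x * x > 1.0 else -1.0)  # subgradient of |x^2 - 1|
        g += noise * rng.gauss(0.0, 1.0)              # stochastic corruption
        x -= 0.3 / k**0.5 * g
    return x

x_final = stochastic_subgradient()
```

The Moreau envelope enters the analysis because f itself has no gradient at the kinks; the envelope's gradient norm serves as the near-stationarity measure that the method provably drives to zero.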
2 code implementations • 12 Jun 2017 • Dmitriy Drusvyatskiy, Henry Wolkowicz
Slater's condition, the existence of a "strictly feasible solution", is a common assumption in conic optimization.
no code implementations • 31 Mar 2017 • Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui
We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions.