no code implementations • 6 Mar 2023 • Tolga Ergen, Halil Ibrahim Gulluk, Jonathan Lacotte, Mert Pilanci
We first show that regularized deep threshold network training problems can be equivalently formulated as a standard convex optimization problem, which parallels the LASSO method, provided that the last hidden layer width exceeds a certain threshold.
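To make the LASSO parallel concrete, here is a minimal sketch assuming synthetic data and randomly fixed threshold activations (the dimensions and the regularization strength alpha are illustrative, not from the paper): once the hidden-layer step activations are fixed, fitting the outer weights with an l1 penalty is a standard LASSO problem.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, m = 200, 10, 512  # samples, input dim, hidden width (assumed)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Fix random threshold (unit-step) hidden activations; the convex view then
# reduces outer-weight training to an l1-regularized fit over binary features.
W = rng.standard_normal((d, m))
F = (X @ W >= 0).astype(float)

model = Lasso(alpha=0.1, max_iter=5000).fit(F, y)
print("active hidden units:", np.count_nonzero(model.coef_))
```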
no code implementations • ICLR 2022 • Yifei Wang, Jonathan Lacotte, Mert Pilanci
As additional consequences of our convex perspective: (i) we establish that Clarke stationary points found by stochastic gradient descent correspond to the global optimum of a subsampled convex problem; (ii) we provide a polynomial-time algorithm for checking whether a neural network is a global minimum of the training loss; (iii) we provide an explicit construction of a continuous path between any neural network and the global minimum of its sublevel set; and (iv) we characterize the minimal size of the hidden layer such that the neural network optimization landscape has no spurious valleys.
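For intuition, here is a hedged cvxpy sketch of a subsampled convex problem of this kind for a two-layer ReLU network (the number of sampled activation patterns P, the dimensions, and the regularization strength beta are all assumed; this is an illustration, not the paper's exact program):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, P = 50, 5, 20  # samples, input dim, sampled activation patterns
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
beta = 0.1

# Sample ReLU activation patterns D_i = 1[X g_i >= 0] from random directions.
D = (X @ rng.standard_normal((d, P)) >= 0).astype(float)

V, W = cp.Variable((d, P)), cp.Variable((d, P))
pred = sum(cp.multiply(D[:, i], X @ (V[:, i] - W[:, i])) for i in range(P))
reg = sum(cp.norm(V[:, i]) + cp.norm(W[:, i]) for i in range(P))
# Cone constraints keep each variable consistent with its activation pattern.
cons = []
for i in range(P):
    s = 2 * D[:, i] - 1
    cons += [cp.multiply(s, X @ V[:, i]) >= 0, cp.multiply(s, X @ W[:, i]) >= 0]
prob = cp.Problem(cp.Minimize(cp.sum_squares(pred - y) + beta * reg), cons)
prob.solve()
print("convex objective:", prob.value)
```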
1 code implementation • NeurIPS 2021 • Michał Dereziński, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney
In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.
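A generic illustration of the idea, assuming a least-squares objective and a plain Gaussian sketch (the sketch size m and the dimensions are made up; the paper's estimator is more refined):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 2000, 50, 200  # rows, columns, sketch size (assumed)
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x = np.zeros(d)
for _ in range(10):
    # Sketch the data matrix: (SA)^T (SA) approximates the Hessian A^T A from
    # only m rows; with a fast sketch this avoids forming the exact Hessian.
    SA = rng.standard_normal((m, n)) @ A / np.sqrt(m)
    g = A.T @ (A @ x - b)  # exact gradient
    x -= np.linalg.solve(SA.T @ SA, g)

print("residual norm:", np.linalg.norm(A @ x - b))
```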
no code implementations • 15 May 2021 • Jonathan Lacotte, Yifei Wang, Mert Pilanci
Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix.
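Here the effective dimension is the standard regularized trace d_eff = tr(H(H + λI)^{-1}), which soft-counts the eigenvalues of H above the regularization level λ; a quick toy computation (dimensions and λ assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 100))
H = A.T @ A  # Hessian of a least-squares objective
lam = 1e3    # regularization level (assumed)

# d_eff = tr(H (H + lam I)^{-1}): a soft count of eigenvalues above lam.
eigs = np.linalg.eigvalsh(H)
d_eff = np.sum(eigs / (eigs + lam))
print(f"ambient dim: {H.shape[0]}, effective dim: {d_eff:.1f}")
```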
1 code implementation • 29 Apr 2021 • Jonathan Lacotte, Mert Pilanci
We propose an adaptive mechanism to control the sketch size according to the progress made in each step of the iterative solver.
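A minimal sketch of such a mechanism on least squares, with a hypothetical accept/reject rule standing in for the paper's actual progress criterion: keep a Newton-sketch step if it reduces the loss, otherwise double the sketch size.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 100
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

x, m = np.zeros(d), 10  # start with a deliberately small sketch
loss = 0.5 * np.linalg.norm(A @ x - b) ** 2
for _ in range(30):
    SA = rng.standard_normal((m, n)) @ A / np.sqrt(m)
    g = A.T @ (A @ x - b)
    step = np.linalg.lstsq(SA.T @ SA, g, rcond=None)[0]
    cand = x - step
    cand_loss = 0.5 * np.linalg.norm(A @ cand - b) ** 2
    if cand_loss < loss:  # progress: accept the step
        x, loss = cand, cand_loss
    else:                 # stalled: grow the sketch
        m = min(2 * m, n)
print("final sketch size:", m, " loss:", loss)
```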
no code implementations • 13 Dec 2020 • Jonathan Lacotte, Mert Pilanci
We propose novel randomized optimization methods for high-dimensional convex problems based on restrictions of variables to random subspaces.
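The basic template, sketched on a toy convex quadratic (dimensions assumed): restrict the problem to x + range(S) for a random S, solve the small subproblem exactly, and repeat.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 500, 25  # ambient and subspace dimensions (assumed)
Q = rng.standard_normal((d, d))
H = Q.T @ Q / d  # PSD Hessian of a convex quadratic
c = rng.standard_normal(d)
f = lambda x: 0.5 * x @ H @ x + c @ x

x = np.zeros(d)
for _ in range(50):
    S = rng.standard_normal((d, k)) / np.sqrt(k)  # random subspace basis
    # Minimize f(x + S z) exactly over the k-dimensional variable z.
    g, Hs = S.T @ (H @ x + c), S.T @ H @ S
    x = x + S @ np.linalg.solve(Hs, -g)
print("f(x):", f(x), " gradient norm:", np.linalg.norm(H @ x + c))
```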
no code implementations • NeurIPS 2020 • Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci
These show that the convergence rates for Haar and randomized Hadamard matrices are identical and asymptotically improve upon those of Gaussian random projections.
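For reference, a randomized Hadamard (SRHT) sketch can be formed as below; the dense hadamard call is used only for clarity, whereas a fast Walsh-Hadamard transform would bring the cost down to O(n log n) (sizes assumed; n must be a power of two):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)
n, m = 1024, 64
A = rng.standard_normal((n, 32))

# SRHT: random sign flips, a (normalized) Hadamard transform, then uniform
# row subsampling, rescaled so that E[S^T S] = I.
signs = rng.choice([-1.0, 1.0], size=n)
HA = hadamard(n) @ (signs[:, None] * A) / np.sqrt(n)
SA = HA[rng.choice(n, size=m, replace=False)] * np.sqrt(n / m)
print("sketch shape:", SA.shape)
```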
no code implementations • NeurIPS 2020 • Jonathan Lacotte, Mert Pilanci
Our method starts with an embedding dimension equal to 1 and increases it over the iterations, up to at most the effective dimension of the Hessian.
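A toy version of such a schedule, assuming geometric growth capped at the effective dimension (the growth rule is illustrative, not the paper's):

```python
import numpy as np

def sketch_dims(eigs, lam, growth=2.0):
    """Yield embedding dimensions 1, 2, 4, ..., capped at the effective
    dimension d_eff = sum_i eig_i / (eig_i + lam)."""
    d_eff = int(np.ceil(np.sum(eigs / (eigs + lam))))
    m = 1
    while True:
        yield min(m, d_eff)
        m = int(np.ceil(growth * m))

# Toy Hessian spectrum: grows geometrically, then saturates at d_eff.
schedule = sketch_dims(np.linspace(0.1, 10.0, 100), lam=1.0)
print([next(schedule) for _ in range(8)])
```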
no code implementations • ICML 2020 • Jonathan Lacotte, Mert Pilanci
Then, we propose a new algorithm by optimizing the computational complexity over the choice of the sketching dimension.
no code implementations • NeurIPS 2019 • Jonathan Lacotte, Mert Pilanci, Marco Pavone
We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces.
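The coordinate-descent connection in one toy example: classical (block) coordinate descent is the special case where the random subspace is spanned by canonical basis vectors (problem sizes assumed).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((300, 200))
b = rng.standard_normal(300)

x, k = np.zeros(200), 5
for _ in range(500):
    # Coordinate descent = subspace spanned by k random canonical directions.
    idx = rng.choice(200, size=k, replace=False)
    As = A[:, idx]
    # Exactly minimize the least-squares loss over the chosen block.
    r = b - A @ x + As @ x[idx]
    x[idx] = np.linalg.lstsq(As, r, rcond=None)[0]
print("residual norm:", np.linalg.norm(A @ x - b))
```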
no code implementations • 13 Aug 2018 • Jonathan Lacotte, Mohammad Ghavamzadeh, Yin-Lam Chow, Marco Pavone
We then derive two different versions of our RS-GAIL optimization problem that aim at matching the risk profiles of the agent and the expert w.r.t.
1 code implementation • 28 Nov 2017 • Sumeet Singh, Jonathan Lacotte, Anirudha Majumdar, Marco Pavone
The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk-neutral.
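As a concrete contrast, conditional value-at-risk (CVaR) is one common coherent risk metric a risk-sensitive model might use instead of the expectation; a toy comparison on synthetic costs (the distribution and the level alpha are assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
costs = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # heavy-tailed toy costs

def cvar(samples, alpha=0.9):
    """Mean of the worst (1 - alpha) fraction of outcomes."""
    tail = np.sort(samples)[int(alpha * len(samples)):]
    return tail.mean()

# A risk-neutral agent compares policies by expected cost; a risk-sensitive
# agent also penalizes the tail of the cost distribution.
print("expected cost:", costs.mean(), " CVaR_0.9:", cvar(costs))
```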