no code implementations • 28 Sep 2020 • Oren Mangoubi, Sushant Sachdeva, Nisheeth K. Vishnoi
We present a first-order algorithm for nonconvex-nonconcave min-max optimization problems such as those that arise in training GANs.
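As an illustration of the min-max setting only (this is the standard gradient descent-ascent baseline, not the paper's algorithm), a minimal sketch for $\min_x \max_y f(x, y)$:

```python
# Simultaneous gradient descent-ascent (GDA): the basic first-order
# baseline for min-max problems min_x max_y f(x, y). This is only an
# illustration of the problem setting, NOT the algorithm from the paper.
def gda(grad_x, grad_y, x0, y0, lr=0.1, steps=200):
    x, y = x0, y0
    for _ in range(steps):
        gx = grad_x(x, y)                 # descend in x
        gy = grad_y(x, y)                 # ascend in y
        x, y = x - lr * gx, y + lr * gy
    return x, y

# f(x, y) = x^2 - y^2 has a saddle point at (0, 0); GDA converges here,
# but on nonconvex-nonconcave objectives it can cycle or diverge, which
# motivates algorithms with convergence guarantees.
x, y = gda(lambda x, y: 2 * x, lambda x, y: -2 * y, 1.0, 1.0)
```

On this convex-concave example both iterates contract by a factor 0.8 per step toward the saddle; nonconvex-nonconcave objectives offer no such guarantee.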
1 code implementation • NeurIPS 2020 • Xuchan Bao, James Lucas, Sushant Sachdeva, Roger Grosse
Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs).
1 code implementation • ICML 2020 • Matthew Fahrbach, Gramoz Goranci, Richard Peng, Sushant Sachdeva, Chi Wang
As computing Schur complements is expensive, we give a nearly-linear time algorithm that generates a coarsened graph on the relevant vertices that provably matches the Schur complement in expectation in each iteration.
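For a graph Laplacian partitioned over eliminated vertices $F$ and retained vertices $C$, the Schur complement onto $C$ is itself a graph Laplacian. A minimal dense sketch of the exact (expensive) computation the paper's nearly-linear time algorithm avoids:

```python
import numpy as np

# Laplacian of a 4-cycle on vertices 0-1-2-3.
L = np.array([[ 2., -1.,  0., -1.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [-1.,  0., -1.,  2.]])

# Eliminate F = {2, 3}, keep the relevant vertices C = {0, 1}.
C, F = [0, 1], [2, 3]
L_CC = L[np.ix_(C, C)]
L_CF = L[np.ix_(C, F)]
L_FF = L[np.ix_(F, F)]

# Schur complement onto C: S = L_CC - L_CF L_FF^{-1} L_FC.
# S is again a graph Laplacian, on the vertex set C.
S = L_CC - L_CF @ np.linalg.solve(L_FF, L_CF.T)
```

Here `S` is the Laplacian of a single edge between the two retained vertices with weight 4/3; the dense formula costs far more than nearly-linear time on large graphs, hence the coarsening approach.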
2 code implementations • 22 Jun 2020 • Vijay Keswani, Oren Mangoubi, Sushant Sachdeva, Nisheeth K. Vishnoi
The equilibrium point found by our algorithm depends on the proposal distribution, and when applying our algorithm to train GANs we choose the proposal distribution to be a distribution of stochastic gradients.
1 code implementation • NeurIPS 2019 • Deeksha Adil, Richard Peng, Sushant Sachdeva
However, these algorithms often diverge for p > 3, and since the work of Osborne (1985), it has been an open problem whether there is an IRLS algorithm that is guaranteed to converge rapidly for p > 3.
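The classical IRLS iteration for $\min_x \|Ax - b\|_p$ repeatedly solves a least-squares problem reweighted by $|r_i|^{p-2}$. A minimal sketch of this naive scheme (which is exactly what can diverge for $p > 3$; the paper's contribution is a variant with rapid-convergence guarantees):

```python
import numpy as np

def irls(A, b, p, iters=50, eps=1e-8):
    """Naive IRLS for min_x ||Ax - b||_p (overdetermined A)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]      # start from the l2 solution
    for _ in range(iters):
        r = A @ x - b
        w = (np.abs(r) + eps) ** (p - 2)          # reweighting; eps guards r_i = 0
        # Weighted least squares: solve A^T W A x = A^T W b.
        x = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * b))
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)
x15 = irls(A, b, p=1.5)                           # converges in practice for 1 < p < 3
```

For `p = 1.5` this reliably improves on the least-squares starting point; for large `p` the same loop can oscillate, which is the divergence issue the paper resolves.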
1 code implementation • NeurIPS 2019 • Guodong Zhang, Lala Li, Zachary Nado, James Martens, Sushant Sachdeva, George E. Dahl, Christopher J. Shallue, Roger Grosse
Increasing the batch size is a popular way to speed up neural network training, but beyond some critical batch size, larger batch sizes yield diminishing returns.
no code implementations • 21 Jan 2019 • Deeksha Adil, Rasmus Kyng, Richard Peng, Sushant Sachdeva
We give improved algorithms for the $\ell_{p}$-regression problem, $\min_{x} \|x\|_{p}$ such that $A x=b,$ for all $p \in (1, 2) \cup (2,\infty).$ Our algorithms obtain a high accuracy solution in $\tilde{O}_{p}(m^{\frac{|p-2|}{2p + |p-2|}}) \le \tilde{O}_{p}(m^{\frac{1}{3}})$ iterations, where each iteration requires solving an $m \times m$ linear system, $m$ being the dimension of the ambient space.
no code implementations • 1 Feb 2017 • Rina Panigrahy, Sushant Sachdeva, Qiuyi Zhang
Iterating, we show that gradient descent can be used to learn the entire network one node at a time.
1 code implementation • NeurIPS 2015 • Rasmus Kyng, Anup Rao, Sushant Sachdeva
Given a directed acyclic graph $G,$ and a set of values $y$ on the vertices, the Isotonic Regression of $y$ is a vector $x$ that respects the partial order described by $G,$ and minimizes $\|x-y\|,$ for a specified norm.
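In the special case where $G$ is a path (a total order) and the norm is $\ell_2$, the classical pool-adjacent-violators algorithm (PAVA) solves Isotonic Regression exactly. A minimal sketch of that special case (the paper handles general DAGs and other norms):

```python
def pava(y):
    """Isotonic regression of y under the order y_1 <= ... <= y_n (l2 norm)."""
    # Each block stores (sum, count); pool adjacent blocks that violate order.
    blocks = []
    for v in y:
        blocks.append((v, 1))
        # Merge while the last block's mean is below the previous block's mean.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s2, c2 = blocks.pop()
            s1, c1 = blocks.pop()
            blocks.append((s1 + s2, c1 + c2))
    # Expand each pooled block back to its block mean.
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out
```

For example, `pava([3, 1, 2])` pools all three values into their mean, returning `[2.0, 2.0, 2.0]`, while an already-monotone input is returned unchanged.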
1 code implementation • 1 May 2015 • Rasmus Kyng, Anup Rao, Sushant Sachdeva, Daniel A. Spielman
We develop fast algorithms for solving regression problems on graphs where one is given the value of a function at some vertices, and must find its smoothest possible extension to all vertices.
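One classical notion of "smoothest" is the $\ell_2$-harmonic extension, obtained by solving the Laplacian system $L_{UU}\, x_U = -L_{UB}\, x_B$ for the unlabeled vertices $U$ given boundary values $x_B$. A minimal dense sketch of that baseline on a path graph (the paper develops much faster algorithms and also covers other smoothness objectives, such as minimal Lipschitz extensions):

```python
import numpy as np

# Path graph on vertices 0-1-2-3 and its Laplacian L.
L = np.array([[ 1., -1.,  0.,  0.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [ 0.,  0., -1.,  1.]])

B, U = [0, 3], [1, 2]          # labeled (boundary) and unlabeled vertices
x_B = np.array([0.0, 3.0])     # given function values at the boundary

# Harmonic extension: minimize the sum over edges of (x_u - x_v)^2,
# which reduces to solving L_UU x_U = -L_UB x_B.
L_UU = L[np.ix_(U, U)]
L_UB = L[np.ix_(U, B)]
x_U = np.linalg.solve(L_UU, -L_UB @ x_B)
```

On a path the harmonic extension is linear interpolation between the boundary values, so `x_U` comes out to `[1.0, 2.0]` here.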
no code implementations • NeurIPS 2012 • Sanjeev Arora, Rong Ge, Ankur Moitra, Sushant Sachdeva
We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees.