no code implementations • 27 Jul 2023 • Runzhe Wang, Sadhika Malladi, Tianhao Wang, Kaifeng Lyu, Zhiyuan Li
Momentum is known to accelerate the convergence of gradient descent in strongly convex settings without stochastic gradient noise.
no code implementations • NeurIPS 2021 • Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora
The current paper is able to establish this global optimality for two-layer Leaky ReLU nets trained with gradient flow on linearly separable and symmetric data, regardless of the width.
no code implementations • NeurIPS 2021 • Baihe Huang, Kaixuan Huang, Sham M. Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang
While the theory of RL has traditionally focused on linear function approximation (or eluder dimension) approaches, little is known about nonlinear RL with neural net approximations of the Q functions.
no code implementations • NeurIPS 2021 • Baihe Huang, Kaixuan Huang, Sham M. Kakade, Jason D. Lee, Qi Lei, Runzhe Wang, Jiaqi Yang
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave, including the low-rank generalized linear bandit problems and two-layer neural network with polynomial activation bandit problem.
no code implementations • 26 Sep 2019 • Rong Ge, Runzhe Wang, Haoyu Zhao
It has been observed \citep{zhang2016understanding} that deep neural networks can memorize: they achieve 100\% accuracy on training data.