no code implementations • 7 Mar 2021 • David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip
Our investigation leads naturally to generalizing SG into Retrospective Approximation (RA) where, during each iteration, a "deterministic solver" executes possibly multiple steps on a subsampled deterministic problem and stops when further solving is deemed unnecessary from the standpoint of statistical efficiency.
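The RA scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample-growth factor, tolerance schedule, plain-gradient-descent inner solver, and the toy objective E[(x - S)^2]/2 are all assumptions chosen for clarity.

```python
import random

def ra_solve(grad_sample, x0, outer_iters=5, m0=8, eps0=0.5,
             growth=2.0, shrink=0.5, lr=0.1, max_inner=1000):
    """Retrospective Approximation sketch: each outer iteration builds a
    subsampled deterministic problem and solves it only to a tolerance,
    tightening the tolerance and enlarging the sample as iterations proceed.
    (Schedules and inner solver are illustrative assumptions.)"""
    x, m, eps = x0, m0, eps0
    for _ in range(outer_iters):
        # Draw a fresh subsample defining the deterministic problem (toy data).
        sample = [random.gauss(0.0, 1.0) for _ in range(m)]
        # "Deterministic solver": plain gradient descent on the subsampled loss.
        for _ in range(max_inner):
            g = sum(grad_sample(x, s) for s in sample) / m
            if abs(g) < eps:
                # Stop early: further solving of this subsampled problem is
                # deemed unnecessary from a statistical-efficiency standpoint.
                break
            x -= lr * g
        m = int(m * growth)   # larger subsample for the next outer iteration
        eps *= shrink         # tighter inner-solve tolerance
    return x

# Toy usage: minimize E[(x - S)^2]/2 with S ~ N(0, 1); a gradient sample is (x - s),
# so the minimizer is the mean 0.
random.seed(0)
x_star = ra_solve(lambda x, s: x - s, x0=5.0)
```

Warm-starting each outer iteration from the previous solution is what makes the inexact early solves cheap without wasting the work already done.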
no code implementations • 7 Jul 2020 • Yuqing Li, Tao Luo, Nung Kwan Yip
Gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the objective function.