Towards Better Generalization: BP-SVRG in Training Deep Neural Networks

Stochastic variance-reduced gradient (SVRG) is a classical optimization method. Although it is theoretically proved to have better convergence performance than stochastic gradient descent (SGD), the generalization performance of SVRG remains open...
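
For readers unfamiliar with the baseline, the following is a minimal sketch of standard SVRG, not the BP-SVRG variant named in the title, whose details are not given in this excerpt. It assumes a toy scalar least-squares objective, and the function name, data, step size, and loop lengths are all hypothetical choices made for illustration. It shows the two ingredients that distinguish SVRG from SGD: a full gradient computed at a periodic snapshot, and a per-sample correction term that reduces the variance of each stochastic step.

```python
import random


def svrg_least_squares(xs, ys, step=0.01, epochs=50, inner_steps=None, seed=0):
    """Standard SVRG on f(w) = (1/n) * sum_i 0.5 * (x_i * w - y_i)**2 for scalar w.

    Illustrative only: the objective, defaults, and names are assumptions,
    not taken from the paper.
    """
    rng = random.Random(seed)
    n = len(xs)
    m = inner_steps if inner_steps is not None else 2 * n

    def grad_i(w, i):
        # Gradient of the i-th sample loss 0.5 * (x_i * w - y_i)**2 w.r.t. w.
        return (xs[i] * w - ys[i]) * xs[i]

    w_snapshot = 0.0
    for _ in range(epochs):
        # Outer loop: full gradient at the current snapshot.
        mu = sum(grad_i(w_snapshot, i) for i in range(n)) / n
        w = w_snapshot
        for _ in range(m):
            i = rng.randrange(n)
            # Variance-reduced estimate: unbiased for the full gradient at w.
            g = grad_i(w, i) - grad_i(w_snapshot, i) + mu
            w -= step * g
        # Use the last inner iterate as the next snapshot.
        w_snapshot = w
    return w_snapshot


# Toy data generated by y = 2 * x, so the minimizer is w = 2.
w_hat = svrg_least_squares([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Plain SGD would use `grad_i(w, i)` alone as the step direction. The extra terms `- grad_i(w_snapshot, i) + mu` keep the estimate unbiased while shrinking its variance as `w` and the snapshot approach the minimizer.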
