NeurIPS 2009 • Chun-Nan Hsu, Yu-Ming Chang, Hanshen Huang, Yuh-Jye Lee
It has been established that the second-order stochastic gradient descent (2SGD) method can potentially achieve generalization performance as good as the empirical optimum in a single pass (i.e., epoch) through the training examples.
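The single-pass claim can be made concrete with a minimal sketch of 2SGD on a least-squares objective (an illustrative assumption, not the setting or algorithm of the paper itself): each step preconditions the per-example gradient with a running Hessian estimate, and for this quadratic loss one pass recovers the (ridge-regularized) empirical optimum exactly.

```python
import numpy as np

def two_sgd_least_squares(X, y, eps=1e-6):
    """Single-pass second-order SGD (2SGD) sketch for least squares.

    Each step applies w <- w - H_t^{-1} g_t, where g_t is the gradient of
    the current example's loss 0.5 * (x @ w - y)**2 and H_t accumulates the
    per-example Hessians x x^T (plus eps * I for invertibility). For this
    quadratic loss the recursion reproduces the ridge-regularized empirical
    optimum after a single pass through the data.
    """
    n, d = X.shape
    w = np.zeros(d)
    H = eps * np.eye(d)                     # regularized running Hessian
    for x, target in zip(X, y):
        H += np.outer(x, x)                 # accumulate per-example Hessian
        grad = (x @ w - target) * x         # per-example gradient
        w -= np.linalg.solve(H, grad)       # second-order (Newton-style) step
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.1 * rng.standard_normal(200)

w = two_sgd_least_squares(X, y)
w_opt = np.linalg.solve(X.T @ X + 1e-6 * np.eye(5), X.T @ y)
# w matches w_opt up to floating-point error after just one epoch
```

For non-quadratic losses the per-example Hessian is no longer constant, so practical 2SGD variants replace the exact accumulation above with cheaper approximations (e.g., diagonal or low-rank estimates), which is where the single-pass guarantee becomes harder to realize.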