L-SR1 Adaptive Regularization by Cubics for Deep Learning

no code implementations • 29 Sep 2021 • Aditya Ranganath, Mukesh Singhal, Roummel Marcia

To avoid saddle points, directions of negative curvature can be used, which requires computing the second-derivative (Hessian) matrix.
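As a rough illustration of what a direction of negative curvature is, the sketch below uses a toy function f(x, y) = x^2 - y^2, whose gradient vanishes at the origin. The function, the helper names, and the step size of 0.5 are assumptions chosen for the example. It uses the exact Hessian of the toy function and is not the method proposed in the paper.

```python
import numpy as np

def f(p):
    """Toy objective with a saddle point at the origin (illustrative only)."""
    x, y = p
    return x**2 - y**2

def hessian(p):
    """Exact second-derivative matrix of f; constant because f is quadratic."""
    return np.array([[2.0, 0.0],
                     [0.0, -2.0]])

def negative_curvature_direction(H):
    """Return (eigenvalue, unit eigenvector) for the most negative eigenvalue
    of a symmetric matrix H, or None if H has no negative eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order
    if eigvals[0] >= 0:
        return None
    return eigvals[0], eigvecs[:, 0]

saddle = np.zeros(2)                      # the gradient of f is zero here
lam, d = negative_curvature_direction(hessian(saddle))
step = saddle + 0.5 * d                   # move along the negative-curvature direction

print("smallest eigenvalue:", lam)
print("f at saddle:", f(saddle), "f after step:", f(step))
```

A gradient-based update makes no progress at the origin because the gradient is zero, while the step along the eigenvector with negative eigenvalue lowers the objective. For large networks, forming and decomposing the full Hessian like this is usually too expensive, which is the cost the sentence above refers to.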

Computational Efficiency, Deep Learning

