no code implementations • 19 Dec 2023 • Satoki Ishikawa, Ryo Karakida
Second-order optimization has been developed to accelerate the training of deep neural networks, and it is now being applied to increasingly large-scale models.
2 code implementations • 8 May 2023 • Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler
Gradient preconditioning is a key technique for integrating second-order information into gradients, improving and extending gradient-based learning algorithms.
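As a minimal sketch of the idea (not the method of either paper above), gradient preconditioning multiplies the gradient by a matrix encoding curvature information before taking a step. On a simple quadratic loss with a known, ill-conditioned Hessian `A` (both the loss and the learning rates here are illustrative assumptions), preconditioning with the inverse Hessian recovers Newton's method and converges far faster than plain gradient descent:

```python
import numpy as np

# Illustrative quadratic loss f(w) = 0.5 * w^T A w with known Hessian A.
A = np.array([[10.0, 0.0],
              [0.0,  1.0]])   # ill-conditioned curvature

def grad(w):
    return A @ w

w_gd = np.array([1.0, 1.0])   # plain gradient descent iterate
w_pc = np.array([1.0, 1.0])   # preconditioned iterate
P = np.linalg.inv(A)          # preconditioner: inverse curvature (Newton step)

for _ in range(20):
    w_gd = w_gd - 0.09 * grad(w_gd)       # plain gradient step
    w_pc = w_pc - 1.0 * (P @ grad(w_pc))  # preconditioned gradient step

# The preconditioned iterate reaches the optimum (the origin) immediately,
# while plain gradient descent still carries error along the flat direction.
print(np.linalg.norm(w_gd), np.linalg.norm(w_pc))
```

Practical second-order optimizers replace the exact inverse Hessian with cheaper curvature approximations (e.g. diagonal or Kronecker-factored estimates), since inverting the full Hessian is infeasible at modern model scales.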