1 code implementation • NeurIPS 2023 • Yun Yue, Zhiling Ye, Jiadi Jiang, Yongchao Liu, Ke Zhang
Additionally, we introduce an auto-switching function that enables the preconditioning matrix to switch dynamically between Stochastic Gradient Descent (SGD) and the adaptive optimizer.
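To make the idea of switching concrete, here is a minimal sketch under stated assumptions. The snippet above does not give the authors' formula, so the thresholded denominator `max(sqrt(v), delta)`, the function name `switched_step`, and the parameters `second_moment` and `delta` are all hypothetical choices made for illustration. Coordinates whose preconditioner entry falls below `delta` take a constant-rate, SGD-like step, while the rest take an adaptive step.

```python
import math


def switched_step(params, grads, second_moment, lr=0.1, delta=1e-2):
    """One elementwise update with a thresholded preconditioner.

    Hypothetical illustration, not the authors' formulation.
    The denominator is max(sqrt(v), delta):
      * sqrt(v) <  delta -> step is (lr / delta) * g, an SGD-like step
      * sqrt(v) >= delta -> step is (lr / sqrt(v)) * g, an adaptive step
    """
    new_params = []
    for p, g, v in zip(params, grads, second_moment):
        denom = max(math.sqrt(v), delta)
        new_params.append(p - lr * g / denom)
    return new_params


# First coordinate has a tiny second-moment estimate (SGD-like regime),
# second coordinate has a large one (adaptive regime).
updated = switched_step([1.0, 1.0], [0.5, 0.5], [1e-6, 4.0])
```

In this sketch `delta` acts as the switching point: a larger `delta` pushes more coordinates into the SGD-like regime, a smaller one makes the update behave more like a fully adaptive method.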
1 code implementation • 25 May 2023 • Yun Yue, Jiadi Jiang, Zhiling Ye, Ning Gao, Yongchao Liu, Ke Zhang
The generalization of Deep Neural Networks (DNNs) is known to be closely related to the flatness of minima, which has led to the development of Sharpness-Aware Minimization (SAM), a method that seeks flatter minima to improve generalization.
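For readers unfamiliar with SAM, the basic two-step update can be sketched as follows. This is a simplified illustration of the standard SAM procedure on plain Python lists, not the method proposed in this paper. The function name `sam_step`, the callback `grad_fn`, and the default values of `lr` and `rho` are assumptions chosen for the example.

```python
import math


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One basic SAM update on a list of floats.

    grad_fn(w) must return the loss gradient at w as a list of floats.
    rho is the radius of the neighborhood searched for the worst case.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(w)  # zero gradient: no ascent direction, nothing to do
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: update the ORIGINAL weights with the gradient at w_adv.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


# Toy loss 0.5 * w^2, whose gradient is simply w.
def quadratic_grad(w):
    return list(w)


w_next = sam_step([1.0], quadratic_grad)
```

Each step evaluates the gradient twice, once at the current weights and once at the perturbed weights, which is why SAM costs roughly twice as much per step as plain SGD.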