1 code implementation • 1 Dec 2023 • Lei Guan, Dongsheng Li, Jiye Liang, Wenjian Wang, Xicheng Lu
The key insight of our proposal is that we employ a weight prediction strategy in the forward pass so that each mini-batch computes with consistent and staleness-free weights.
1 code implementation • 5 Sep 2023 • Lei Guan
This paper proposes an efficient optimizer called AdaPlus, which integrates Nesterov momentum and precise stepsize adjustment on the basis of AdamW.
1 code implementation • 26 May 2023 • Lei Guan, Dongsheng Li, Yanqi Shi, Jian Meng
the future weights to update the DNN parameters, making the gradient-based optimizer achieve better convergence and generalization compared to the original optimizer without weight prediction.
no code implementations • 1 Feb 2023 • Lei Guan
In this paper, we introduce weight prediction into the AdamW optimizer to boost its convergence when training deep neural network (DNN) models.
no code implementations • 24 Oct 2019 • Lei Guan, Wotao Yin, Dongsheng Li, Xicheng Lu
It allows the overlapping of the pipelines of multiple micro-batches, including those belonging to different mini-batches.
no code implementations • 5 Nov 2018 • Tao Sun, Penghang Yin, Dongsheng Li, Chun Huang, Lei Guan, Hao Jiang
For objective functions satisfying a relaxed strongly convex condition, linear convergence is established under weaker assumptions on the step size and inertial parameter than those made in the existing literature.
no code implementations • 11 Sep 2018 • Lei Guan, Linbo Qiao, Dongsheng Li, Tao Sun, Keshi Ge, Xicheng Lu
Support vector machines (SVMs) with sparsity-inducing nonconvex penalties have received considerable attention for their ability to perform classification and variable selection automatically.