2 code implementations • 21 Mar 2024 • Zih-Syuan Huang, Ching-pei Lee
We propose a Regularized Adaptive Momentum Dual Averaging (RAMDA) algorithm for training structured neural networks.
2 code implementations • ICLR 2022 • Zih-Syuan Huang, Ching-pei Lee
This paper proposes an algorithm (RMDA) for training neural networks (NNs) with a regularization term for promoting desired structures.
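The abstract above does not spell out how a regularization term promotes structure, but a standard mechanism in this line of work is a proximal step that zeroes out small parameters. Below is a minimal, hypothetical sketch of the proximal operator of the ℓ1 norm (soft-thresholding), which induces sparsity, one common example of a "desired structure"; it is an illustration of the general idea, not the RMDA algorithm itself.

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1.

    Shrinks every entry toward zero and sets entries with
    magnitude below tau exactly to zero -- this is how an l1
    regularizer, applied via a proximal step, produces sparse
    (structured) parameters.
    """
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

w = np.array([0.05, -0.3, 1.2, -0.01])
print(soft_threshold(w, 0.1))  # → [ 0.  -0.2  1.1  0. ]
```

Entries smaller than the threshold become exactly zero rather than merely small, which is what distinguishes proximal/regularized methods from simply penalizing the loss and rounding afterward.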
no code implementations • 29 Sep 2021 • Zih-Syuan Huang, Ching-pei Lee
Stochastic gradient descent with momentum (SGD+M) is widely used to empirically improve the convergence behavior and generalization performance of plain stochastic gradient descent (SGD) when training deep learning models, but our theoretical understanding of SGD+M remains limited.
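For concreteness, the SGD+M update studied here can be sketched as the classical heavy-ball iteration: a velocity term accumulates an exponential average of past gradients and the parameters move along it. The step size and momentum values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """One SGD+M (heavy-ball) step.

    velocity <- beta * velocity - lr * grad
    w        <- w + velocity
    """
    velocity = beta * velocity - lr * grad
    return w + velocity, velocity

# Toy example: minimize f(w) = 0.5 * ||w||^2, whose gradient is w.
w = np.array([1.0, -2.0])
v = np.zeros_like(w)
for _ in range(200):
    w, v = sgd_momentum_step(w, grad=w, velocity=v)
print(np.linalg.norm(w))  # converges toward 0
```

In stochastic training, `grad` would be a minibatch gradient; the momentum term damps the resulting noise, which is one informal explanation for the empirical benefits the abstract refers to.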