no code implementations • 30 Mar 2024 • Bo Liu, Lemeng Wu, Lizhang Chen, Kaizhao Liang, Jiaxu Zhu, Chen Liang, Raghuraman Krishnamoorthi, Qiang Liu
The Lion optimizer has been a promising competitor to AdamW for training large AI models, with advantages in memory, computation, and sample efficiency.
no code implementations • 9 Oct 2023 • Lizhang Chen, Bo Liu, Kaizhao Liang, Qiang Liu
As might be expected from the output of a random search program, Lion incorporates elements from several existing algorithms, including signed momentum, decoupled weight decay, and Polyak and Nesterov momentum, but does not fit into any existing category of theoretically grounded optimizers.
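The ingredients named above can be sketched as a single update rule. This is a minimal illustration with assumed hyperparameter values, not the authors' reference implementation: the parameter step uses only the sign of an interpolated momentum, and weight decay is applied directly to the weights (decoupled).

```python
import numpy as np

def lion_step(param, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.01):
    """One Lion-style update (sketch; hyperparameters are illustrative)."""
    # Signed momentum: interpolate current gradient with the momentum
    # buffer, then keep only the sign for the step direction.
    update = np.sign(beta1 * m + (1 - beta1) * grad)
    # Decoupled weight decay: shrink weights directly, independent of grad.
    param = param - lr * (update + wd * param)
    # Momentum is tracked as an exponential moving average of gradients.
    m = beta2 * m + (1 - beta2) * grad
    return param, m

# Toy usage: one step on f(x) = x^2, whose gradient is 2x.
x = np.array([1.0])
m = np.zeros_like(x)
x, m = lion_step(x, 2 * x, m)  # step moves x toward the minimum at 0
```

Because the step magnitude is fixed by the sign operation, Lion needs only the momentum buffer as state, which underlies the memory advantage mentioned above.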
no code implementations • 19 Nov 2020 • Shangxi Wu, Jitao Sang, Xian Zhao, Lizhang Chen
Deep learning models suffer from the problem of semantic discontinuity: small perturbations in the input space tend to cause semantic-level interference in the model output.
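A toy example makes the discontinuity concrete. The sketch below is purely illustrative (not the paper's method or model): for a linear classifier, an input lying near the decision boundary can have its predicted label flipped by a perturbation that is tiny relative to the input itself.

```python
import numpy as np

# Hypothetical linear classifier: label = sign(w . x).
w = np.array([1.0, -1.0])

# An input just on the positive side of the boundary.
x = np.array([0.500, 0.499])
score_clean = w @ x            # small positive score

# A sign-aligned perturbation of magnitude 0.002 per coordinate,
# pointed against the decision score (FGSM-style direction).
x_adv = x - 0.002 * np.sign(w)
score_adv = w @ x_adv          # score crosses zero: the label flips
```

The perturbation changes each coordinate by 0.002, yet the predicted class changes, which is the kind of input-output discontinuity the abstract refers to.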