Search Results for author: Bingrui Li

Found 2 papers, 0 papers with code

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

no code implementations • 7 Oct 2024 • Bingrui Li, Wei Huang, Andi Han, Zhanpeng Zhou, Taiji Suzuki, Jun Zhu, Jianfei Chen

We also show that Adam behaves similarly to SignGD in terms of both optimization and generalization in this setting.
