Search Results for author: Hengxu Yu

Found 2 papers, 1 paper with code

BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models

1 code implementation • 3 Apr 2024 • Qijun Luo, Hengxu Yu, Xiao Li

This work presents BAdam, an optimizer that leverages the block coordinate optimization framework with Adam as the inner solver.
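To illustrate the block coordinate framework described above, here is a minimal NumPy sketch: parameters are split into blocks, and each block in turn is updated by a few Adam steps while all other blocks stay frozen. This is only a toy illustration of the general idea on a quadratic objective, not BAdam itself (which operates on transformer parameter blocks with backpropagation); the function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Standard Adam update applied to a single parameter block.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def block_coordinate_adam(loss_grad, theta, n_blocks=2, inner_steps=20, outer_iters=20):
    # Block coordinate optimization with Adam as the inner solver:
    # cycle through parameter blocks, running a few Adam steps on the
    # active block while every other block stays frozen.
    blocks = np.array_split(np.arange(theta.size), n_blocks)
    for _ in range(outer_iters):
        for idx in blocks:
            m = np.zeros(idx.size)
            v = np.zeros(idx.size)
            for t in range(1, inner_steps + 1):
                g = loss_grad(theta)[idx]  # gradient restricted to this block
                theta[idx], m, v = adam_step(theta[idx], g, m, v, t)
    return theta

# Toy problem: minimize ||theta - target||^2, whose gradient is 2*(theta - target).
target = np.array([1.0, 2.0, 3.0, 4.0])
theta = block_coordinate_adam(lambda th: 2.0 * (th - target), np.zeros(4))
```

Freezing the inactive blocks is what makes the approach memory efficient in the LLM setting: optimizer states (the `m` and `v` buffers) only need to be kept for the currently active block.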

High Probability Guarantees for Random Reshuffling

No code implementations • 20 Nov 2023 • Hengxu Yu, Xiao Li

The proposed stopping criterion is guaranteed to be triggered after a finite number of iterations, after which $\mathsf{RR}$-$\mathsf{sc}$ returns an iterate whose gradient is below $\varepsilon$ with high probability.
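As context for the snippet above, a minimal NumPy sketch of random reshuffling on a finite-sum problem: each epoch draws a fresh permutation of the component functions and steps through them without replacement, unlike vanilla SGD's i.i.d. sampling. The paper's actual $\mathsf{RR}$-$\mathsf{sc}$ stopping criterion is not reproduced here; the full-gradient check below is a simplified stand-in, and all names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def random_reshuffling(component_grads, theta, lr=0.05, epochs=30, eps=1e-3, seed=0):
    # Random Reshuffling (RR): each epoch visits all n component gradients
    # in a fresh random order, sampled without replacement.
    rng = np.random.default_rng(seed)
    n = len(component_grads)
    for _ in range(epochs):
        for i in rng.permutation(n):
            theta = theta - lr * component_grads[i](theta)
        # Simplified stopping check (stand-in for the paper's criterion):
        # stop once the gradient of the average loss falls below eps.
        full_grad = sum(g(theta) for g in component_grads) / n
        if np.linalg.norm(full_grad) < eps:
            break
    return theta

# Finite sum of quadratics f_i(x) = ||x - a_i||^2; the minimizer of the
# average loss is the mean of the anchors a_i, here (2, 3).
anchors = [np.array([0.0, 1.0]), np.array([2.0, 3.0]), np.array([4.0, 5.0])]
grads = [(lambda th, a=a: 2.0 * (th - a)) for a in anchors]
theta = random_reshuffling(grads, np.zeros(2))
```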
