Search Results for author: Xuan Zhan

Found 1 paper, 0 papers with code

Optimizing Large Model Training through Overlapped Activation Recomputation

no code implementations • 13 Jun 2024 • Ping Chen, Wenjie Zhang, Shuibing He, Yingjie Gu, Zhuwei Peng, Kexin Huang, Xuan Zhan, Weijian Chen, Yi Zheng, Zhefeng Wang, Yanlong Yin, Gang Chen

Our comprehensive evaluation using GPT models with 1.3B-20B parameters shows that both OPT and HEU outperform the state-of-the-art recomputation approaches (e.g., Megatron-LM and Checkmate) by 1.02-1.53x.

Scheduling
