Search Results for author: Shenhan Zhu

Found 1 paper, 1 paper with code

Improving Automatic Parallel Training via Balanced Memory Workload Optimization

1 code implementation • 5 Jul 2023 • Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Shenhan Zhu, Xiaonan Nie, Yaofeng Tu, Bin Cui

Transformer models have emerged as the leading approach for achieving state-of-the-art performance across various application domains, serving as the foundation for advanced large-scale deep learning (DL) models.

