no code implementations • 17 Sep 2023 • Junjie Zhu, Yiying Li, Chunping Qiu, Ke Yang, Naiyang Guan, Xiaodong Yi
To tackle these issues, we turn to recently proposed parameter-efficient tuning methods such as VPT, which updates only the newly added prompt parameters while keeping the pre-trained backbone frozen.
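As a minimal sketch of the VPT-style tuning described above: learnable prompt tokens are prepended to the token sequence while every pre-trained backbone weight stays frozen, so the optimizer only ever updates the prompts. The backbone, dimensions, and prompt count below are illustrative placeholders, not the paper's actual model.

```python
# Illustrative sketch of VPT-style parameter-efficient tuning (not the paper's code).
import torch
import torch.nn as nn

class PromptedBackbone(nn.Module):
    def __init__(self, backbone: nn.Module, embed_dim: int = 768, num_prompts: int = 10):
        super().__init__()
        self.backbone = backbone
        # The only trainable parameters: newly added prompt tokens.
        self.prompts = nn.Parameter(torch.zeros(1, num_prompts, embed_dim))
        nn.init.trunc_normal_(self.prompts, std=0.02)
        # Freeze every pre-trained backbone weight.
        for p in self.backbone.parameters():
            p.requires_grad = False

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Prepend the learnable prompt tokens to the patch-token sequence.
        prompts = self.prompts.expand(tokens.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, tokens], dim=1))

# Hypothetical stand-in for a pre-trained transformer encoder.
encoder_layer = nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True)
backbone = nn.TransformerEncoder(encoder_layer, num_layers=2)
model = PromptedBackbone(backbone)
# The optimizer sees only the prompt parameters; the backbone is untouched.
optimizer = torch.optim.AdamW([p for p in model.parameters() if p.requires_grad], lr=1e-3)
```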
no code implementations • 13 Feb 2023 • Shiwei Zhang, Xiaodong Yi, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin
This paper presents TAG, an automatic system to derive an optimized DNN training graph and its deployment onto an arbitrary device topology, enabling expedited training in device- and topology-heterogeneous ML clusters.
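TAG's actual search algorithm is not reproduced here; the following is only a toy illustration of the underlying problem it automates: assigning DNN graph operators to a heterogeneous device topology under a cost model. The device speeds, op costs, and greedy rule below are invented for illustration.

```python
# Toy greedy placement over a heterogeneous cluster (illustrative only; not TAG's algorithm).
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    speed: float  # relative throughput; differs across heterogeneous devices
    load: float = 0.0

def place(ops: list[tuple[str, float]], devices: list[Device]) -> dict[str, str]:
    """Assign each op (name, cost) to the device that would finish it earliest."""
    placement = {}
    for name, cost in ops:
        best = min(devices, key=lambda d: d.load + cost / d.speed)
        best.load += cost / best.speed
        placement[name] = best.name
    return placement

ops = [("conv1", 4.0), ("conv2", 4.0), ("fc", 1.0)]
devices = [Device("gpu:fast", speed=2.0), Device("gpu:slow", speed=1.0)]
print(place(ops, devices))  # maps each op name to a device name
```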
1 code implementation • 28 Oct 2021 • Jinhui Yuan, Xinqi Li, Cheng Cheng, Juncheng Liu, Ran Guo, Shenghang Cai, Chi Yao, Fei Yang, Xiaodong Yi, Chuan Wu, Haoran Zhang, Jie Zhao
Aiming at a simple, neat redesign of distributed deep learning frameworks for various parallelism paradigms, we present OneFlow, a novel distributed training framework based on an SBP (split, broadcast and partial-value) abstraction and the actor model.
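For a concrete taste of the SBP abstraction, here is a small sketch using OneFlow's global-tensor API; the exact call names are assumed from recent OneFlow releases rather than quoted from the paper.

```python
# Sketch of SBP annotations in OneFlow (API names assumed from recent releases).
import oneflow as flow

# A placement describes which devices hold the (logical) tensor.
placement = flow.placement("cuda", ranks=[0, 1])

x = flow.randn(4, 8)
# split(0): shard the tensor along dim 0 across the two ranks.
x_split = x.to_global(placement=placement, sbp=flow.sbp.split(0))
# broadcast: every rank holds a full replica.
x_bcast = x.to_global(placement=placement, sbp=flow.sbp.broadcast)
# partial_sum: each rank holds a partial value; the logical tensor is their sum.
x_partial = x.to_global(placement=placement, sbp=flow.sbp.partial_sum)
```

This is meant to be launched with one process per rank (e.g. via OneFlow's distributed launcher) so that both ranks in the placement exist; the three annotations correspond directly to the split, broadcast, and partial-value cases named in the abstract.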