no code implementations • 9 Feb 2024 • Liangyu Zhao, Saeed Maleki, Ziyue Yang, Hossein Pourreza, Aashaka Shah, Changho Hwang, Arvind Krishnamurthy
ForestColl also outperforms other state-of-the-art schedule generation techniques with both up to 61\% more efficient generated schedules and orders of magnitude faster schedule generation speed.
1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng
In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).
no code implementations • 15 Mar 2023 • Haonan Zhong, Jiamin Chang, Ziyue Yang, Tingmin Wu, Pathum Chamikara Mahawaga Arachchige, Chehara Pathmabandu, Minhui Xue
toxic images) and models (resp.
2 code implementations • 7 Jun 2022 • Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong
On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1. 55x and 2. 11x speedup in training and inference over Fairseq, respectively.
1 code implementation • 11 Jun 2021 • Yuanchun Li, Ziqi Zhang, Bingyan Liu, Ziyue Yang, Yunxin Liu
The knowledge of a deep learning model may be transferred to a student model, leading to intellectual property infringement or vulnerability propagation.