3 code implementations • 20 May 2024 • Jian Hu, Xibin Wu, Zilin Zhu, Xianyu, Weixun Wang, Dehao Zhang, Yu Cao
However, unlike pretraining or fine-tuning a single model, scaling reinforcement learning from human feedback (RLHF) for training large language models poses coordination challenges across four models.
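The four models in question are conventionally the actor (the policy being trained), a critic for value estimates, a reward model, and a frozen reference model used for a KL penalty. Below is a minimal sketch of how the four interact in one PPO-style update, with tiny stand-in networks so it runs without checkpoints; the loss and penalty terms are illustrative simplifications, not OpenRLHF's actual implementation.

```python
# Minimal sketch of the four models coordinated in PPO-style RLHF.
# Tiny stand-in networks are used so the example runs without downloads;
# real systems load large pretrained checkpoints for each role.
import torch
import torch.nn as nn

vocab, hidden = 100, 32

actor = nn.Sequential(nn.Embedding(vocab, hidden), nn.Linear(hidden, vocab))      # policy being trained
reference = nn.Sequential(nn.Embedding(vocab, hidden), nn.Linear(hidden, vocab))  # frozen copy for KL penalty
critic = nn.Sequential(nn.Embedding(vocab, hidden), nn.Linear(hidden, 1))         # value estimates
reward_model = nn.Sequential(nn.Embedding(vocab, hidden), nn.Linear(hidden, 1))   # scores responses

reference.requires_grad_(False)
reward_model.requires_grad_(False)

tokens = torch.randint(0, vocab, (4, 16))  # a batch of sampled responses

# One conceptual update: reward minus a KL penalty against the reference,
# advantage from the critic, policy-gradient loss on the actor.
logp_actor = torch.log_softmax(actor(tokens), -1).gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
logp_ref = torch.log_softmax(reference(tokens), -1).gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
kl = (logp_actor - logp_ref).sum(-1)
reward = reward_model(tokens).mean(dim=(1, 2)) - 0.1 * kl
advantage = reward - critic(tokens).mean(dim=(1, 2))
loss = -(advantage.detach() * logp_actor.sum(-1)).mean()
loss.backward()
```

The coordination challenge the abstract refers to comes from keeping all four of these models (each potentially sharded across many GPUs) fed and synchronized during generation, scoring, and training phases.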
no code implementations • 25 May 2023 • Sitian Shen, Zilin Zhu, Linqian Fan, Harry Zhang, Xinxiao Wu
Large pre-trained models have had a significant impact on computer vision by enabling multi-modal learning; among them, the CLIP model has achieved impressive results in image classification, object detection, and semantic segmentation.
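For context, the mechanism behind CLIP's results is matching images against natural-language label descriptions. A standard zero-shot classification recipe with the open-source Hugging Face CLIP weights is sketched below; the image path and label prompts are hypothetical, and this illustrates CLIP itself, not this paper's specific method.

```python
# Zero-shot image classification with CLIP: score an image against
# natural-language label prompts and softmax the similarities.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg")  # hypothetical input image
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores become class probabilities via softmax.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```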
no code implementations • 21 Sep 2022 • Hui Su, Xiao Zhou, Houjin Yu, Xiaoyu Shen, YuWen Chen, Zilin Zhu, Yang Yu, Jie Zhou
Large Language Models pre-trained with self-supervised learning have demonstrated impressive zero-shot generalization capabilities on a wide spectrum of tasks.
1 code implementation • 12 Aug 2021 • Jiarui Fang, Zilin Zhu, Shenggui Li, Hui Su, Yang Yu, Jie Zhou, Yang You
PatrickStar uses the CPU-GPU heterogeneous memory space to store the model data.
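The core idea is that parameters can reside in CPU memory and be staged onto the GPU only while needed. A minimal sketch of that chunk-offloading pattern follows; chunk sizing, eviction policy, and overlap with compute are where PatrickStar's actual complexity lies, and none of that is reproduced here.

```python
# Sketch of chunk offloading across the CPU-GPU heterogeneous memory space:
# chunks live in pinned CPU memory and are copied to the GPU on demand.
import torch

def make_chunks(num_chunks=8, chunk_numel=1 << 20):
    # Pinned CPU memory enables fast, asynchronous CPU->GPU copies.
    return [torch.randn(chunk_numel, pin_memory=True) for _ in range(num_chunks)]

def forward_over_chunks(chunks, device="cuda"):
    total = 0.0
    for cpu_chunk in chunks:
        gpu_chunk = cpu_chunk.to(device, non_blocking=True)  # stage chunk in
        total += gpu_chunk.square().sum().item()             # compute on GPU
        del gpu_chunk                                        # release GPU memory
    return total

if torch.cuda.is_available():
    print(forward_over_chunks(make_chunks()))
```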
no code implementations • 20 Oct 2020 • Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, Xiaowen Chu
Distributed training techniques have been widely deployed in large-scale deep neural networks (DNNs) training on dense-GPU clusters.
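As a baseline for the distributed techniques the abstract refers to, data-parallel training replicates the model on every GPU and all-reduces gradients each step. A minimal PyTorch DistributedDataParallel sketch with a toy model and random data is shown below (launch with `torchrun --nproc_per_node=N`); it illustrates the baseline pattern only, not this paper's optimizations.

```python
# Minimal data-parallel training loop with PyTorch DDP.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")     # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Linear(128, 10).cuda(), device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for _ in range(10):
        x = torch.randn(32, 128, device="cuda")
        y = torch.randint(0, 10, (32,), device="cuda")
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()  # DDP all-reduces gradients across GPUs here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```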