Search Results for author: Bingyang Wu

Found 5 papers, 2 papers with code

StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation

no code implementations · 22 Apr 2025 · Yinmin Zhong, Zili Zhang, Xiaoniu Song, Hanpeng Hu, Chao Jin, Bingyang Wu, Nuo Chen, Yukun Chen, Yu Zhou, Changyi Wan, HongYu Zhou, Yimin Jiang, Yibo Zhu, Daxin Jiang

In real-world deployments, we observe that the colocated architecture suffers from resource coupling, where the two stages (generation and training) are constrained to use the same resources.

Reinforcement Learning (RL) · Scheduling

Optimizing RLHF Training for Large Language Models with Stage Fusion

no code implementations · 20 Sep 2024 · Yinmin Zhong, Zili Zhang, Bingyang Wu, Shengyu Liu, Yukun Chen, Changyi Wan, Hanpeng Hu, Lei Xia, Ranchen Ming, Yibo Zhu, Xin Jin

Due to the intrinsic nature of RLHF training, i.e., the data skewness in the generation stage and the pipeline bubbles in the training stage, existing RLHF systems suffer from low GPU utilization.

LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism

1 code implementation · 15 Apr 2024 · Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin

The context window of large language models (LLMs) is rapidly increasing, leading to a huge variance in resource usage between different requests as well as between different phases of the same request.

A Survey of Resource-efficient LLM and Multimodal Foundation Models

1 code implementation · 16 Jan 2024 · Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, QiPeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu

Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion models, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment.

Survey
