Search Results for author: Xianyu

Found 1 paper, 1 paper with code

OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework

1 code implementation • 20 May 2024 • Jian Hu, Xibin Wu, Weixun Wang, Xianyu, Dehao Zhang, Yu Cao

However, unlike pretraining or fine-tuning a single model, scaling reinforcement learning from human feedback (RLHF) for training large language models poses coordination challenges across four models.

reinforcement-learning, Scheduling
