Search Results for author: Jiangming Shu

Found 2 papers, 2 papers with code

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

1 code implementation · 22 Dec 2024 · Yuxiang Zhang, YuQi Yang, Jiangming Shu, Yuhang Wang, Jinlin Xiao, Jitao Sang

OpenAI's recent introduction of Reinforcement Fine-Tuning (RFT) showcases the potential of reasoning foundation models and offers a new paradigm for fine-tuning beyond simple pattern imitation.

o1-Coder: an o1 Replication for Coding

1 code implementation · 29 Nov 2024 · Yuxiang Zhang, Shangxi Wu, YuQi Yang, Jiangming Shu, Jinlin Xiao, Chao Kong, Jitao Sang

The technical report introduces O1-CODER, an attempt to replicate OpenAI's o1 model with a focus on coding tasks.

Reinforcement Learning (RL)
