Search Results for author: R. X. Xu

Found 4 papers, 4 papers with code

Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

2 code implementations • 14 Dec 2023 • Peiyi Wang, Lei LI, Zhihong Shao, R. X. Xu, Damai Dai, Yifei Li, Deli Chen, Y. Wu, Zhifang Sui

In this paper, we present an innovative process-oriented math process reward model called Math-Shepherd, which assigns a reward score to each step of math problem solutions.

Ranked #22 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +2
