Search Results for author: Yanhui Huang

Found 2 papers, 0 papers with code

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment

no code implementations • 23 Oct 2024 • Yanshi Li, Shaopan Xiong, Gengru Chen, Xiaoyang Li, Yijia Luo, Xingyao Zhang, Yanhui Huang, Xingyuan Bu, Yingshui Tan, Chun Yuan, Jiamang Wang, Wenbo Su, Bo Zheng

Our method improves the success rate on adversarial samples by 10% compared to the sample-wise approach, and achieves a 1.3% improvement on evaluation benchmarks such as MMLU, GSM8K, and HumanEval.
