Search Results for author: Chengpeng Li

Found 4 papers, 3 papers with code

Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation

no code implementations • 25 Oct 2023 • Chengpeng Li, Zhengyi Yang, Jizhi Zhang, Jiancan Wu, Dingxian Wang, Xiangnan He, Xiang Wang

Therefore, the data sparsity of reward signals and state transitions is severe, yet it has long been overlooked by existing RL recommenders. Worse still, RL methods learn through trial and error, but negative feedback cannot be obtained in implicit-feedback recommendation tasks, which aggravates the overestimation problem of offline RL recommenders.

Contrastive Learning • Offline RL • +3

Query and Response Augmentation Cannot Help Out-of-domain Math Reasoning Generalization

1 code implementation • 9 Oct 2023 • Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou

In this paper, we investigate such data augmentation in math reasoning, aiming to answer: (1) which data augmentation strategies are more effective; (2) what the scaling relationship is between the amount of augmented data and model performance; and (3) whether data augmentation can incentivize generalization to out-of-domain mathematical reasoning tasks.
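A common building block in query/response augmentation pipelines of this kind is an answer-consistency filter: generated responses are kept only if their final answer matches the reference. The sketch below illustrates that filtering step with a simple last-number heuristic; the helper names and the regex-based answer parser are assumptions for illustration, not the paper's actual code.

```python
import re


def extract_final_answer(response):
    """Return the last number in a response, a common heuristic for
    grading GSM8K-style solutions (a simplification, not the paper's parser)."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response.replace(",", ""))
    return numbers[-1] if numbers else None


def filter_augmented(responses, reference_answer):
    """Keep only augmented responses whose final answer matches the reference."""
    return [r for r in responses if extract_final_answer(r) == reference_answer]


# Hypothetical augmented responses for a question whose reference answer is 7.
candidates = [
    "Tom has 3 + 4 = 7 apples. The answer is 7",
    "3 * 4 = 12, so the answer is 12",
    "Adding 3 and 4 gives 7",
]
kept = filter_augmented(candidates, "7")  # the incorrect response is dropped
```

In practice the kept (question, response) pairs would then be added to the fine-tuning set; the scaling questions above concern how much of this augmented data actually helps.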

Ranked #50 on Math Word Problem Solving on MATH (using extra training data)

Arithmetic Reasoning • Data Augmentation • +3

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

2 code implementations • 9 Oct 2023 • Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou

We propose four intriguing research questions to explore the association between model performance and various factors, including data amount, composition ratio, model size, and SFT strategies.
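The "composition ratio" factor above refers to how ability-specific SFT datasets (e.g. math, code, general instructions) are mixed into one training set. A minimal sketch of such a mixing step, assuming hypothetical helper and dataset names not taken from the paper's code:

```python
import random


def mix_sft_data(sources, ratios, total, seed=0):
    """Build an SFT mixture of `total` examples, sampling from each
    ability-specific dataset with replacement according to `ratios`.
    A hypothetical illustration of a composition-ratio experiment."""
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    rng = random.Random(seed)
    mixture = []
    for data, ratio in zip(sources, ratios):
        k = int(round(total * ratio))
        mixture.extend(rng.choices(data, k=k))
    rng.shuffle(mixture)  # interleave abilities before training
    return mixture


# Toy ability-specific pools (placeholders for real SFT examples).
math_data = [f"math-{i}" for i in range(100)]
code_data = [f"code-{i}" for i in range(100)]
general_data = [f"gen-{i}" for i in range(100)]

mix = mix_sft_data([math_data, code_data, general_data],
                   ratios=[0.5, 0.3, 0.2], total=1000)
```

Sweeping `ratios` (and `total`, alongside model size) is one way to probe the associations the paper studies.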

Code Generation • Instruction Following • +2

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

1 code implementation • 3 Aug 2023 • Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Keming Lu, Chuanqi Tan, Chang Zhou, Jingren Zhou

We find that rejection sampling fine-tuning (RFT) improves mathematical reasoning performance more for LLMs when the augmented samples contain more distinct reasoning paths.
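Counting "distinct reasoning paths" requires deciding when two sampled solutions are the same path; the paper deduplicates by comparing the equations a solution uses. The sketch below illustrates that idea with a simplified regex-based equation signature; the function names and the exact normalization are assumptions, not the released implementation.

```python
import re


def reasoning_signature(solution):
    """Signature of a reasoning path: its sequence of simple equations,
    whitespace-normalized. A simplification of the paper's equation-based
    deduplication, for illustration only."""
    equations = re.findall(r"\d+\s*[+\-*/]\s*\d+\s*=\s*\d+", solution)
    return tuple(re.sub(r"\s+", "", eq) for eq in equations)


def select_distinct(correct_solutions):
    """Keep one sampled solution per distinct equation sequence."""
    seen, kept = set(), []
    for sol in correct_solutions:
        sig = reasoning_signature(sol)
        if sig not in seen:
            seen.add(sig)
            kept.append(sol)
    return kept


# Three correct samples; the first two follow the same reasoning path.
samples = [
    "3 + 4 = 7, then 7 * 2 = 14. The answer is 14",
    "First 3+4=7; then 7*2=14. The answer is 14",
    "2 * 7 = 14 because 4 + 3 = 7. The answer is 14",
]
distinct = select_distinct(samples)  # two distinct reasoning paths remain
```

The distinct solutions are then added back as extra fine-tuning data, which is how the amount of augmentation ties into the scaling relationship the title refers to.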

Ranked #100 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning • GSM8K • +1
