Search Results for author: Yanshi Li

Found 3 papers, 1 paper with code

ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models

1 code implementation · 27 Feb 2025 · Haibin Chen, Kangtao Lv, Chengwei Hu, Yanshi Li, Yujin Yuan, Yancheng He, Xingyao Zhang, Langming Liu, Shilei Liu, Wenbo Su, Bo Zheng

To address these problems, we propose ChineseEcomQA, a scalable question-answering benchmark focused on fundamental e-commerce concepts.

Question Answering, RAG, +1 more

Equilibrate RLHF: Towards Balancing Helpfulness-Safety Trade-off in Large Language Models

no code implementations · 17 Feb 2025 · Yingshui Tan, Yilei Jiang, Yanshi Li, Jiaheng Liu, Xingyuan Bu, Wenbo Su, Xiangyu Yue, Xiaoyong Zhu, Bo Zheng

Fine-tuning large language models (LLMs) based on human preferences, commonly achieved through reinforcement learning from human feedback (RLHF), has been effective in improving their performance.

Safety Alignment

Adaptive Dense Reward: Understanding the Gap Between Action and Reward Space in Alignment

no code implementations · 23 Oct 2024 · Yanshi Li, Shaopan Xiong, Gengru Chen, Xiaoyang Li, Yijia Luo, Xingyao Zhang, Yanhui Huang, Xingyuan Bu, Yingshui Tan, Chun Yuan, Jiamang Wang, Wenbo Su, Bo Zheng

Our method improves the success rate on adversarial samples by 10% compared to the sample-wise approach, and achieves a 1.3% improvement on evaluation benchmarks such as MMLU, GSM8K, HumanEval, etc.

GSM8K, HumanEval, +1 more
