Search Results for author: JianGuang Lou

Found 3 papers, 2 papers with code

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

no code implementations • 15 Jul 2024 • Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, JianGuang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen

In this paper, we introduce Arena Learning, an innovative offline strategy designed to simulate these arena battles using AI-driven annotations to evaluate battle outcomes, thus facilitating the continuous improvement of the target model through both supervised fine-tuning and reinforcement learning.

Chatbot

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

1 code implementation • 18 Aug 2023 • Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, JianGuang Lou, Chongyang Tao, Xiubo Geng, Qingwei Lin, Shifeng Chen, Dongmei Zhang

Through extensive experiments on two mathematical reasoning benchmarks, namely GSM8k and MATH, we reveal the extraordinary capabilities of our model.

Ranked #51 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +2
