Search Results for author: Zhibin Gou

Found 9 papers, 6 papers with code

Rho-1: Not All Tokens Are What You Need

2 code implementations • 11 Apr 2024 • Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan, Weizhu Chen

After fine-tuning, Rho-1-1B and 7B achieved state-of-the-art results of 40.6% and 51.8% on the MATH dataset, respectively, matching DeepSeekMath with only 3% of the pretraining tokens.

Continual Pretraining, Language Modelling, +1

217

Exploring the Mystery of Influential Data for Mathematical Reasoning

no code implementations • 1 Apr 2024 • Xinzhe Ni, Yeyun Gong, Zhibin Gou, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Additionally, we showcase the use of QaDS in creating efficient fine-tuning mixtures with various selection ratios, and analyze the quality of a wide range of open-source datasets, which can perform as a reference for future works on mathematical reasoning tasks.

Math, Mathematical Reasoning

Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning

no code implementations • 4 Mar 2024 • Yiming Huang, Xiao Liu, Yeyun Gong, Zhibin Gou, Yelong Shen, Nan Duan, Weizhu Chen

Large language models (LLMs) have shown great potential in complex reasoning tasks, yet their performance is often hampered by the scarcity of high-quality and reasoning-focused training datasets.

Ranked #29 on Math Word Problem Solving on MATH

GSM8K, Math, +1

CriticBench: Benchmarking LLMs for Critique-Correct Reasoning

1 code implementation • 22 Feb 2024 • Zicheng Lin, Zhibin Gou, Tian Liang, Ruilin Luo, Haowei Liu, Yujiu Yang

Utilizing CriticBench, we evaluate and dissect the performance of 17 LLMs in generation, critique, and correction reasoning, i.e., GQC reasoning.

Benchmarking

SciAgent: Tool-augmented Language Models for Scientific Reasoning

no code implementations • 18 Feb 2024 • Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen

To make this task more practical and solvable for LLMs, we introduce a new task setting named tool-augmented scientific reasoning.

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

1 code implementation • 29 Sep 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen

Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.

Ranked #10 on Math Word Problem Solving on MATH (using extra training data)

Arithmetic Reasoning, Computational Efficiency, +3

819

MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction

1 code implementation • 22 May 2023 • Zhibin Gou, Qingyan Guo, Yujiu Yang

Generative methods greatly promote aspect-based sentiment analysis via generating a sequence of sentiment elements in a specified format.

Ranked #1 on Aspect-Based Sentiment Analysis (ABSA) on ACOS

Aspect-Based Sentiment Analysis, Aspect Category Detection, +10

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

1 code implementation • 19 May 2023 • Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

Unlike these models, humans typically utilize external tools to cross-check and refine their initial content, like using a search engine for fact-checking, or a code interpreter for debugging.

Fact Checking, Natural Questions, +4

616

Long Time No See! Open-Domain Conversation with Long-Term Persona Memory

1 code implementation • Findings (ACL) 2022 • Xinchao Xu, Zhibin Gou, Wenquan Wu, Zheng-Yu Niu, Hua Wu, Haifeng Wang, Shihang Wang

Most of the open-domain dialogue models tend to perform poorly in the setting of long-term human-bot conversations.

Dialogue Generation, Management

1,694
