Math Word Problem Solving
63 papers with code • 11 benchmarks • 17 datasets
A math word problem is a mathematical exercise (such as in a textbook, worksheet, or exam) where significant background information on the problem is presented in ordinary language rather than in mathematical notation. As most word problems involve a narrative of some sort, they are sometimes referred to as story problems and may vary in the amount of technical language used.
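A toy sketch of what this translation looks like (the problem and all names below are illustrative, not drawn from any listed benchmark): the narrative facts are restated as an equation, and solving the equation answers the question.

```python
# Problem (prose): "Maya buys 3 notebooks and 2 pens for $13.
# A notebook costs $1 more than a pen. How much does a pen cost?"
#
# Translation into math: 3*(p + 1) + 2*p = 13  ->  5*p + 3 = 13  ->  p = 2

def solve_pen_price(total=13, notebooks=3, pens=2, premium=1):
    """Solve total = notebooks*(p + premium) + pens*p for the pen price p."""
    return (total - notebooks * premium) / (notebooks + pens)

print(solve_pen_price())  # 2.0
```

The hard part for models is exactly this first step: mapping ordinary-language narrative onto the symbolic form, after which the arithmetic is routine.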
Libraries
Use these libraries to find Math Word Problem Solving models and implementations
Latest papers
KnowledgeMath: Knowledge-Intensive Math Word Problem Solving in Finance Domains
We introduce KnowledgeMath, a novel benchmark designed to evaluate LLMs' capabilities in applying financial knowledge to solve complex math word problems.
ATHENA: Mathematical Reasoning with Thought Expansion
Solving math word problems depends on how the problems are articulated, which serves as the lens through which models view human linguistic expressions.
Mistral 7B
We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.
Query and Response Augmentation Cannot Help Out-of-domain Math Reasoning Generalization
In this paper, we conduct an investigation for such data augmentation in math reasoning and are intended to answer: (1) What strategies of data augmentation are more effective; (2) What is the scaling relationship between the amount of augmented data and model performance; and (3) Can data augmentation incentivize generalization to out-of-domain mathematical reasoning tasks?
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
In this paper, we present a method to fine-tune open-source language models, enabling them to use code for modeling and deriving math equations and, consequently, enhancing their mathematical reasoning abilities.
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%.
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Specifically, we consider the general SFT training data, consisting of a small amount of expert data mixed with a large proportion of sub-optimal data, without any preference labels.
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Through extensive experiments on two mathematical reasoning benchmarks, namely GSM8k and MATH, we reveal the extraordinary capabilities of our model.
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs.
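A minimal sketch of the generate-execute-verify loop described above (this is an illustration of the general idea, not the paper's implementation; the candidate solutions and the sanity check are hypothetical): candidate solution programs are executed, their outputs are checked against a verification predicate, and unreasonable outputs are rejected.

```python
# Code-based self-verification sketch: run each candidate solution,
# evaluate its output, and keep only an answer that passes a check.

def self_verify(candidates, check):
    """Return the first candidate answer that executes and passes verification."""
    for solve in candidates:
        try:
            answer = solve()          # execute the generated solution code
        except Exception:
            continue                  # execution failed; try the next candidate
        if check(answer):             # evaluate the output's plausibility
            return answer
    return None                       # no candidate survived verification

# Hypothetical candidates for "twice the sum of 3 and 4":
candidates = [
    lambda: 3 + 4 * 2,     # buggy: drops the parentheses -> 11
    lambda: (3 + 4) * 2,   # correct -> 14
]
# Sanity check: "twice" anything must be even, so 11 is rejected.
print(self_verify(candidates, check=lambda a: a % 2 == 0))  # 14
```

The verification predicate here is a cheap structural check; in the paper's setting the model itself generates and runs verification code against its own solution.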