Mathematical Problem-Solving
25 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
Measuring Mathematical Problem Solving With the MATH Dataset
To facilitate future research and increase accuracy on MATH, we also contribute a large auxiliary pretraining dataset which helps teach models the fundamentals of mathematics.
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
Large language models (LLMs) have shown excellent mastering of human language, but still struggle in real-world applications that require mathematical problem-solving.
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data
This paper investigates the mathematical problem-solving capabilities of LLMs using the newly developed "MathOdyssey" dataset.
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
In this paper, we introduce Hunyuan-Large, currently the largest open-source Transformer-based mixture-of-experts model, with 389 billion total parameters and 52 billion activated parameters, capable of handling up to 256K tokens.
Abstractors and relational cross-attention: An inductive bias for explicit relational reasoning in Transformers
An extension of Transformers is proposed that enables explicit relational reasoning through a novel module called the Abstractor.
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
The development of large language models (LLMs) such as ChatGPT has recently attracted a great deal of attention.
Evaluating Language Models for Mathematics through Interactions
There is much excitement about the opportunity to harness the power of large language models (LLMs) when building problem-solving assistants.
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics.
Data Contamination Through the Lens of Time
Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluating publicly available benchmarks.
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
We first analyze the limitations of current Multimodal Large Language Models (MLLMs) in this area: they struggle to accurately comprehend basic geometric elements and their relationships.