Mathematical Reasoning

110 papers with code • 5 benchmarks • 15 datasets



Most implemented papers

PAL: Program-aided Language Models

srush/minichain 18 Nov 2022

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs both to understand the problem description by decomposing it into steps and to solve each step of the problem.
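The program-aided idea named in the title can be illustrated with a minimal sketch: the model writes the reasoning steps as a short program, and an interpreter, not the model, computes the answer. The `generated_program` string below is a hand-written stand-in for model output, and `run_generated_program` is a hypothetical helper; neither comes from the paper's code.

```python
def run_generated_program(program: str) -> object:
    """Execute a model-written program and return its `answer` variable.

    Assumption: the program stores its result in a variable named `answer`.
    Emptying the builtins is NOT a real sandbox; it only keeps this toy tidy.
    """
    namespace: dict = {}
    exec(program, {"__builtins__": {}}, namespace)
    return namespace["answer"]


# Hand-written stand-in for what a model might emit for a toy word problem:
# "Roger has 5 tennis balls. He buys 2 cans of 3 balls each. How many now?"
generated_program = """
# Step 1: balls Roger starts with.
tennis_balls = 5
# Step 2: balls bought, 2 cans of 3 balls each.
bought_balls = 2 * 3
# Step 3: total.
answer = tennis_balls + bought_balls
"""

answer = run_generated_program(generated_program)
print(answer)
```

The decomposition into steps still comes from the model; only the arithmetic is delegated to the interpreter.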

Reasoning with Language Model Prompting: A Survey

zjunlp/Prompt4ReasoningPapers 19 Dec 2022

Reasoning, as an essential ability for complex problem-solving, can provide back-end support for various real-world applications, such as medical diagnosis, negotiation, etc.

Mathematical Capabilities of ChatGPT

snfrieder/ghosts NeurIPS 2023

We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology.

Sparks of Artificial General Intelligence: Early experiments with GPT-4

microsoft/guidance 22 Mar 2023

We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.

Self-Refine: Iterative Refinement with Self-Feedback

jina-ai/thinkgpt NeurIPS 2023

Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement.
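The feedback-and-refinement loop described here might be sketched as follows. The three callables stand in for LLM calls and are hypothetical, as is the convention that a feedback of `None` means "stop"; the excerpt does not specify a stopping rule.

```python
from typing import Callable, Optional


def self_refine(
    task: str,
    generate: Callable[[str], str],
    feedback: Callable[[str, str], Optional[str]],
    refine: Callable[[str, str, str], str],
    max_iters: int = 3,
) -> str:
    """Draft an output, then alternate critique and revision."""
    draft = generate(task)
    for _ in range(max_iters):
        critique = feedback(task, draft)
        if critique is None:  # assumed stop signal: nothing left to fix
            break
        draft = refine(task, draft, critique)
    return draft


# Toy stand-ins for the model calls (illustration only).
def toy_generate(task: str) -> str:
    return "2 + 2 = 5"


def toy_feedback(task: str, draft: str) -> Optional[str]:
    return None if draft.endswith("= 4") else "The sum is wrong; recompute."


def toy_refine(task: str, draft: str, critique: str) -> str:
    return "2 + 2 = 4"


result = self_refine("Compute 2 + 2.", toy_generate, toy_feedback, toy_refine)
print(result)
```

In practice the same model would typically play all three roles, with `max_iters` bounding the cost.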

SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training

deep-symbolic-mathematics/Multimodal-Math-Pretraining 3 Oct 2023

To bridge the gap, we introduce SNIP, a Symbolic-Numeric Integrated Pre-training model, which employs contrastive learning between symbolic and numeric domains, enhancing their mutual similarities in the embeddings.
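As a rough illustration of contrastive learning between two embedding spaces, the sketch below computes a symmetric, InfoNCE-style loss in plain Python. The loss form, the `temperature` value, and the toy vectors are assumptions; the excerpt does not state which objective SNIP uses.

```python
import math
from typing import List

Vector = List[float]


def _normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _diagonal_cross_entropy(logits: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def contrastive_loss(
    symbolic: List[Vector], numeric: List[Vector], temperature: float = 0.1
) -> float:
    """Symmetric contrastive loss; pair i of each list is a positive pair."""
    s = [_normalize(v) for v in symbolic]
    n = [_normalize(v) for v in numeric]
    logits = [
        [sum(a * b for a, b in zip(si, nj)) / temperature for nj in n]
        for si in s
    ]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        _diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed)
    )


# Toy embeddings: matched pairs point the same way, mismatched pairs do not.
symbolic = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]
mismatched = [[0.1, 0.9], [0.9, 0.1]]

loss_matched = contrastive_loss(symbolic, matched)
loss_mismatched = contrastive_loss(symbolic, mismatched)
print(loss_matched < loss_mismatched)
```

Minimizing such a loss pulls each symbolic embedding toward its paired numeric embedding and away from the other items in the batch.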

How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

ofa-sys/gsm8k-screl 9 Oct 2023

We propose four intriguing research questions to explore the association between model performance and various factors including data amount, composition ratio, model size and SFT strategies.

Autonomous Data Selection with Language Models for Mathematical Texts

hiyouga/llama-factory 12 Feb 2024

Our method showcases a twofold increase in pretraining token efficiency compared to state-of-the-art baselines, underscoring the potential of our approach in enhancing models' mathematical reasoning capabilities.

Evaluating Mathematical Reasoning Beyond Accuracy

gair-nlp/reasoneval 8 Apr 2024

To measure reasoning beyond final-answer accuracy, we introduce ReasonEval, a new methodology for evaluating the quality of reasoning steps.

Learning to Prove Theorems via Interacting with Proof Assistants

princeton-vl/CoqGym 21 May 2019

Proof assistants offer a formalism that resembles human mathematical reasoning, representing theorems in higher-order logic and proofs as high-level tactics.
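To make "proofs as high-level tactics" concrete, here is a small tactic proof. It is written in Lean 4 purely for illustration; the paper's environment is built on Coq, and this example is not taken from it.

```lean
-- Each tactic transforms the current goal, much like a step in a written proof.
example (p q : Prop) (h : p ∧ q) : q ∧ p := by
  constructor      -- split the goal `q ∧ p` into two goals: `q` and `p`
  · exact h.2      -- the second component of `h` proves `q`
  · exact h.1      -- the first component of `h` proves `p`
```

A learned prover in this setting would choose the next tactic given the current goal and hypotheses.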