Arithmetic Reasoning

70 papers with code • 2 benchmarks • 3 datasets

Benchmarks

These leaderboards are used to track progress in Arithmetic Reasoning

Dataset	Best Model
GSM8K	GPT-4 Code Interpreter (CSV, K=5)
MultiArith	Text-davinci-002 (175B) (zero-shot-cot)

Libraries

Use these libraries to find Arithmetic Reasoning models and implementations

epfllm/megatron-llm • 3 papers • 452
huggingface/transformers • 2 papers • 124,593
squeezeailab/squeezellm • 2 papers • 560
skytliang/multi-agents-debate • 2 papers • 170

Most implemented papers

LLaMA: Open and Efficient Foundation Language Models

facebookresearch/llama • arXiv 2023

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.

Llama 2: Open Foundation and Fine-Tuned Chat Models

facebookresearch/llama • 18 Jul 2023

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

GPT-4 Technical Report

openai/evals • Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

Mistral 7B

mistralai/mistral-src • 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.

Llemma: An Open Language Model For Mathematics

eleutherai/gpt-neox • 16 Oct 2023

We present Llemma, a large language model for mathematics.

Mastering Symbolic Operations: Augmenting Language Models with Compiled Neural Networks

wengsyx/neural-comprehension • 4 Apr 2023

Our work highlights the potential of seamlessly unifying explicit rule learning via CoNNs and implicit pattern learning in LMs, paving the way for true symbolic comprehension capabilities.

Large Language Models are Zero-Shot Reasoners

kojima-takeshi188/zero_shot_cot • 24 May 2022

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars.

PAL: Program-aided Language Models

srush/minichain • 18 Nov 2022

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs for both understanding the problem description by decomposing it into steps, as well as solving each step of the problem.

Reasoning with Language Model Prompting: A Survey

zjunlp/Prompt4ReasoningPapers • 19 Dec 2022

Reasoning, as an essential ability for complex problem-solving, can provide back-end support for various real-world applications, such as medical diagnosis, negotiation, etc.

Batch Prompting: Efficient Inference with Large Language Model APIs

xlang-ai/batch-prompting • 19 Jan 2023

We extensively validate the effectiveness of batch prompting on ten datasets across commonsense QA, arithmetic reasoning, and NLI/NLU: batch prompting significantly (up to 5x with six samples in batch) reduces the LLM (Codex) inference token and time costs while achieving better or comparable performance.
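
As a rough illustration of the batch prompting idea summarized in the entry above, the sketch below packs several questions into one prompt and maps the indexed answers back to their questions. The function names, the `Q[i]`/`A[i]` slot format, and the instruction wording are hypothetical choices made for this example and are not taken from the paper or its repository; no model call is made.

```python
import re


def build_batch_prompt(questions):
    """Pack several questions into a single prompt with indexed slots.

    The 'Q[i]' / 'A[i]' format is an assumption made for this sketch.
    """
    lines = [f"Q[{i}]: {q}" for i, q in enumerate(questions, start=1)]
    lines.append("Answer each question on its own line as 'A[i]: <answer>'.")
    return "\n".join(lines)


def parse_batch_response(text, n):
    """Map each 'A[i]: ...' line back to its question.

    Slots the model did not fill stay None so the caller can retry them.
    """
    answers = [None] * n
    for match in re.finditer(r"^A\[(\d+)\]:\s*(.+)$", text, flags=re.MULTILINE):
        idx = int(match.group(1))
        if 1 <= idx <= n:
            answers[idx - 1] = match.group(2).strip()
    return answers


if __name__ == "__main__":
    prompt = build_batch_prompt(["What is 2 + 3?", "What is 4 * 5?"])
    print(prompt)
    # A hand-written stand-in for a model reply, used only to exercise the parser.
    print(parse_batch_response("A[1]: 5\nA[2]: 20", 2))
```

Sending one prompt for several questions means any shared instructions or exemplars are paid for once per batch rather than once per question, which is the kind of token and time saving the abstract refers to.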

