Natural Language Processing

GSM8K

67 papers with code • 0 benchmarks • 0 datasets

Libraries

Use these libraries to find GSM8K models and implementations

microsoft/guidance • 2 papers • 17,174

microsoft/LLMLingua • 2 papers • 3,623

Most implemented papers

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

guidance-ai/guidance • 28 Jan 2022

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.
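As a rough illustration of the idea in this abstract, a few-shot chain-of-thought prompt can be sketched as worked exemplars whose answers spell out the intermediate steps, followed by the new question. The `Exemplar` class, the `build_cot_prompt` helper, the "Q:/A:" layout, and the sample problem below are all assumptions made for this sketch and are not taken from the paper or from any listed repository.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exemplar:
    """One worked example (hypothetical structure for this sketch)."""
    question: str
    reasoning: str  # the intermediate reasoning steps, written out in prose
    answer: str     # the final answer, stated after the reasoning


def build_cot_prompt(exemplars: List[Exemplar], question: str) -> str:
    """Join worked exemplars and a new question into a single prompt string.

    Each exemplar shows its reasoning before the final answer, so a model
    completing the last "A:" is nudged to reason step by step as well.
    """
    parts = [
        f"Q: {ex.question}\nA: {ex.reasoning} The answer is {ex.answer}."
        for ex in exemplars
    ]
    # The new question ends with an open "A:" for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Made-up exemplar in the style of a grade-school word problem.
    demo = Exemplar(
        question="A shelf holds 3 boxes with 4 books each. How many books are there?",
        reasoning="Each box has 4 books and there are 3 boxes, so 3 * 4 = 12.",
        answer="12",
    )
    print(build_cot_prompt([demo], "A bag has 5 red and 7 blue marbles. How many marbles are there?"))
```

The resulting string would be sent to whichever language model is being evaluated; how the final answer is parsed from the completion is left out of this sketch.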

Training Verifiers to Solve Math Word Problems

openai/grade-school-math • 27 Oct 2021

State-of-the-art language models can match human performance on many tasks, but they still struggle to robustly perform multi-step mathematical reasoning.

Matrix Information Theory for Self-Supervised Learning

yifanzhang-pro/matrix-ssl • 27 May 2023

Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.

AskIt: Unified Programming Interface for Programming with Large Language Models

katsumiok/pyaskit • 29 Aug 2023

Developers face decisions regarding the use of LLMs for directly performing tasks within applications as well as for generating and executing code to accomplish these tasks.

Large Language Models are Zero-Shot Reasoners

kojima-takeshi188/zero_shot_cot • 24 May 2022

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars.

Language Models are Multilingual Chain-of-Thought Reasoners

google-research/url-nlp • 6 Oct 2022

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

PAL: Program-aided Language Models

srush/minichain • 18 Nov 2022

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs both for understanding the problem description by decomposing it into steps and for solving each step of the problem.

Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models

zoeyyao27/graph-of-thought • 26 May 2023

Therefore, we propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph.

Large Language Models as Optimizers

google-deepmind/opro • 7 Sep 2023

In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language.

MR-GSM8K: A Meta-Reasoning Revolution in Large Language Model Evaluation

dvlab-research/mr-gsm8k • 28 Dec 2023

In this work, we introduce a novel evaluation paradigm for Large Language Models, one that challenges them to engage in meta-reasoning.
