GSM8K

67 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

guidance-ai/guidance 28 Jan 2022

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.

Training Verifiers to Solve Math Word Problems

openai/grade-school-math 27 Oct 2021

State-of-the-art language models can match human performance on many tasks, but they still struggle to robustly perform multi-step mathematical reasoning.

Matrix Information Theory for Self-Supervised Learning

yifanzhang-pro/matrix-ssl 27 May 2023

Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.

AskIt: Unified Programming Interface for Programming with Large Language Models

katsumiok/pyaskit 29 Aug 2023

Developers face decisions regarding the use of LLMs for directly performing tasks within applications as well as for generating and executing code to accomplish these tasks.

Large Language Models are Zero-Shot Reasoners

kojima-takeshi188/zero_shot_cot 24 May 2022

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars.

Language Models are Multilingual Chain-of-Thought Reasoners

google-research/url-nlp 6 Oct 2022

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

PAL: Program-aided Language Models

srush/minichain 18 Nov 2022

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs for both understanding the problem description by decomposing it into steps, as well as solving each step of the problem.

Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models

zoeyyao27/graph-of-thought 26 May 2023

Therefore, we propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph.

Large Language Models as Optimizers

google-deepmind/opro 7 Sep 2023

In this work, we propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers, where the optimization task is described in natural language.

MR-GSM8K: A Meta-Reasoning Revolution in Large Language Model Evaluation

dvlab-research/mr-gsm8k 28 Dec 2023

In this work, we introduce a novel evaluation paradigm for Large Language Models, one that challenges them to engage in meta-reasoning.