GSM8K

152 papers with code • 1 benchmark • 1 dataset

GSM8K (Grade School Math 8K) is a dataset of 8.5K linguistically diverse grade-school math word problems, introduced in "Training Verifiers to Solve Math Word Problems" (Cobbe et al., 2021). Each problem takes between two and eight steps of basic arithmetic to solve, making the benchmark a standard test of multi-step mathematical reasoning in language models.

Most implemented papers

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

microsoft/guidance 28 Jan 2022

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.
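A minimal sketch of what few-shot chain-of-thought prompting looks like in practice: a worked exemplar with intermediate reasoning steps is prepended to the new question, so the model imitates the reasoning format. The exemplar is the well-known tennis-ball problem used in the paper; the model call itself is omitted, only prompt construction is shown.

```python
# Few-shot chain-of-thought prompt construction (sketch).
# The exemplar demonstrates intermediate reasoning before the answer;
# the model is expected to produce the same format for the new question.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked exemplar so the model emits its own reasoning chain."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "A farmer has 3 pens with 4 sheep in each pen. How many sheep in total?"
)
```

The prompt ends at `A:` so the completion begins with the model's reasoning rather than a bare answer.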

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

thudm/chatglm-6b 18 Jun 2024

We introduce ChatGLM, an evolving family of large language models that we have been developing over time.

Training Verifiers to Solve Math Word Problems

openai/grade-school-math 27 Oct 2021

State-of-the-art language models can match human performance on many tasks, but they still struggle to robustly perform multi-step mathematical reasoning.
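The paper's approach can be sketched as best-of-n reranking: sample many candidate solutions and keep the one a trained verifier scores highest. The `verifier_score` below is a hypothetical placeholder (a word-count heuristic); in the paper it is a learned model that estimates solution correctness.

```python
# Verifier-based reranking (sketch): generate n candidate solutions,
# score each with a verifier, return the highest-scoring one.

def verifier_score(solution: str) -> float:
    # Placeholder standing in for a trained verifier model.
    return float(len(solution.split()))

def best_of_n(candidates: list[str]) -> str:
    """Return the candidate solution the verifier rates highest."""
    return max(candidates, key=verifier_score)
```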

Qwen2 Technical Report

qwenlm/qwen2 15 Jul 2024

This report introduces the Qwen2 series, the latest addition to our large language models and large multimodal models.

Large Language Models are Zero-Shot Reasoners

kojima-takeshi188/zero_shot_cot 24 May 2022

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars.
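The paper's zero-shot method needs no hand-written exemplars: appending the trigger phrase "Let's think step by step." elicits a reasoning chain, and a second prompt extracts the final answer. A sketch of the two-stage prompt construction, with the model call itself omitted:

```python
# Zero-shot chain-of-thought (sketch): stage 1 elicits reasoning via a
# trigger phrase; stage 2 appends the generated reasoning and asks for
# the final answer in a fixed, parseable format.

REASONING_TRIGGER = "Let's think step by step."
ANSWER_TRIGGER = "Therefore, the answer (arabic numerals) is"

def zero_shot_cot_prompts(question: str, reasoning: str = "") -> tuple[str, str]:
    stage1 = f"Q: {question}\nA: {REASONING_TRIGGER}"
    stage2 = f"{stage1} {reasoning}\n{ANSWER_TRIGGER}"
    return stage1, stage2
```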

Language Models are Multilingual Chain-of-Thought Reasoners

google-research/url-nlp 6 Oct 2022

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

Self-Consistency Improves Chain of Thought Reasoning in Language Models

codelion/optillm 21 Mar 2022

Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks.

PAL: Program-aided Language Models

srush/minichain 18 Nov 2022

Much of this success can be attributed to prompting methods such as "chain-of-thought", which employ LLMs both to understand the problem description by decomposing it into steps and to solve each step of the problem.
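PAL's key move is to offload the "solve" half to an interpreter: the LLM writes its reasoning as executable Python, and the final answer is computed by running that program rather than by the model itself. In this sketch the generated program is a hand-written stand-in for model output:

```python
# Program-aided language models (sketch): the model emits a Python
# program as its reasoning; the interpreter, not the model, computes
# the answer. GENERATED_PROGRAM stands in for actual model output.

GENERATED_PROGRAM = """
# Roger has 5 balls; he buys 2 cans of 3 balls each.
tennis_balls = 5
bought_balls = 2 * 3
answer = tennis_balls + bought_balls
"""

def run_pal_program(program: str) -> object:
    """Execute a generated solution and read off its `answer` variable."""
    namespace: dict = {}
    exec(program, namespace)  # a trusted sandbox is assumed in practice
    return namespace["answer"]

result = run_pal_program(GENERATED_PROGRAM)  # -> 11
```

Because arithmetic is delegated to the interpreter, the model only has to get the decomposition right, not the calculation.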

Matrix Information Theory for Self-Supervised Learning

yifanzhang-pro/matrix-ssl 27 May 2023

Inspired by this framework, we introduce Matrix-SSL, a novel approach that leverages matrix information theory to interpret the maximum entropy encoding loss as matrix uniformity loss.

AskIt: Unified Programming Interface for Programming with Large Language Models

katsumiok/pyaskit 29 Aug 2023

Developers face decisions regarding the use of LLMs for directly performing tasks within applications as well as for generating and executing code to accomplish these tasks.