StrategyQA

20 papers with code • 0 benchmarks • 0 datasets

StrategyQA aims to measure the ability of models to answer questions that require multi-step implicit reasoning. For example, answering "Did Aristotle use a laptop?" requires implicitly reasoning about when Aristotle lived and when laptops were invented.

Source: BIG-bench

Most implemented papers

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

allenai/dolma 8 Dec 2021

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Self-Consistency Improves Chain of Thought Reasoning in Language Models

codelion/optillm 21 Mar 2022

Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks.
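The core idea is easy to sketch: sample several chain-of-thought completions at nonzero temperature and take a majority vote over their final answers. A minimal sketch, assuming a hypothetical sample_cot stub in place of a real LLM sampling call:

```python
from collections import Counter

def sample_cot(question: str, seed: int) -> str:
    # Hypothetical stub: a real run would sample an LLM with
    # temperature > 0 under a chain-of-thought prompt.
    canned = [
        "Aristotle died in 322 BC; laptops are modern, so the answer is no",
        "Laptops were invented in the 1980s, so the answer is no",
        "Aristotle was a philosopher, so the answer is yes",
    ]
    return canned[seed % len(canned)]

def extract_answer(completion: str) -> str:
    # Keep only the text after the final "answer is" marker.
    return completion.rsplit("answer is", 1)[-1].strip()

def self_consistent_answer(question: str, n_samples: int = 5) -> str:
    answers = [extract_answer(sample_cot(question, s)) for s in range(n_samples)]
    # Marginalize over sampled reasoning paths: the most frequent answer wins.
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("Did Aristotle use a laptop?"))  # -> "no"
```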

Improving Planning with Large Language Models: A Modular Agentic Architecture

MAPLLM/MAPICLR2025sub 30 Sep 2023

To improve planning with LLMs, we propose an agentic architecture, the Modular Agentic Planner (MAP), in which planning is accomplished via the recurrent interaction of specialized modules, each implemented using an LLM.
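A minimal sketch of such a recurrent module loop, assuming hypothetical module names (propose/predict/evaluate) and a stubbed llm call rather than the paper's exact interfaces:

```python
_state = {"evals": 0}

def llm(prompt: str) -> str:
    # Hypothetical stub standing in for per-module LLM calls.
    if prompt.startswith("evaluate"):
        _state["evals"] += 1
        return "goal reached" if _state["evals"] > 1 else "not yet"
    if prompt.startswith("propose"):
        return "move disk 1 to peg C"
    return "disk 1 is on peg C"

def propose_action(state: str, goal: str) -> str:
    return llm(f"propose: from {state} toward {goal}")

def predict_next_state(state: str, action: str) -> str:
    return llm(f"predict: effect of {action} in {state}")

def goal_satisfied(state: str, goal: str) -> bool:
    return "goal reached" in llm(f"evaluate: does {state} satisfy {goal}?")

def plan(state: str, goal: str, max_steps: int = 10) -> list[str]:
    # Recurrent interaction: evaluate, propose, simulate, repeat.
    actions = []
    for _ in range(max_steps):
        if goal_satisfied(state, goal):
            break
        action = propose_action(state, goal)
        state = predict_next_state(state, action)
        actions.append(action)
    return actions

print(plan("disk 1 on peg A", "disk 1 on peg C"))  # -> ['move disk 1 to peg C']
```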

Training Compute-Optimal Large Language Models

karpathy/llama2.c 29 Mar 2022

We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget.
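The paper's headline finding is that parameters and training tokens should scale roughly equally with compute. A back-of-the-envelope sketch using the standard C ≈ 6·N·D FLOPs approximation and the commonly quoted ~20 tokens-per-parameter rule of thumb (an approximation, not the paper's exact fit):

```python
def compute_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    # C = 6*N*D with D = k*N  =>  N = sqrt(C / (6*k)), D = k*N
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's own budget (~5.76e23 FLOPs) recovers roughly
# 70B parameters and 1.4T tokens.
n, d = compute_optimal(5.76e23)
print(f"params ≈ {n / 1e9:.0f}B, tokens ≈ {d / 1e12:.1f}T")
```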

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

zhentingqi/rstar 12 Aug 2024

This paper introduces rStar, a self-play mutual reasoning approach that significantly improves reasoning capabilities of small language models (SLMs) without fine-tuning or superior models.
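A highly simplified sketch of the mutual-verification idea: one SLM generates candidate reasoning trajectories (via MCTS in the paper), and a second SLM re-completes each trajectory from a partial prefix, keeping trajectories whose final answers it independently reproduces. Both model calls are hypothetical stubs:

```python
def generator(question: str) -> list[list[str]]:
    # Hypothetical stub: candidate step-by-step trajectories, each
    # ending in a final answer.
    return [
        ["Aristotle died in 322 BC", "laptops appeared in the 1980s", "answer: no"],
        ["Aristotle was Greek", "answer: yes"],
    ]

def discriminator(question: str, prefix: list[str]) -> str:
    # Hypothetical stub: a second SLM completes the reasoning
    # from a partial prefix.
    return "answer: no"

def mutually_verified(question: str) -> list[list[str]]:
    kept = []
    for traj in generator(question):
        prefix = traj[: len(traj) // 2]        # hide the later steps
        completion = discriminator(question, prefix)
        if completion == traj[-1]:             # answers agree -> keep
            kept.append(traj)
    return kept

print(mutually_verified("Did Aristotle use a laptop?"))
```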

Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies

eladsegal/strategyqa 6 Jan 2021

A key limitation in current datasets for multi-hop reasoning is that the steps required to answer a question are stated explicitly in the question itself.
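An illustrative StrategyQA-style example (fields paraphrased from the released dataset); note that none of the decomposition steps appear in the question itself:

```python
example = {
    "question": "Did Aristotle use a laptop?",
    "answer": False,
    # The implicit strategy a solver must come up with on its own.
    "decomposition": [
        "When did Aristotle live?",
        "When was the laptop invented?",
        "Does the laptop's invention predate Aristotle's lifetime?",
    ],
    # Supporting facts for the intermediate steps.
    "facts": [
        "Aristotle died in 322 BC.",
        "The first laptops appeared in the 1980s.",
    ],
}
```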

Distilling Reasoning Capabilities into Smaller Language Models

kumar-shridhar/distiiling-lm 1 Dec 2022

In this work, we propose an alternative reasoning scheme, Socratic CoT, that learns a decomposition of the original problem into a sequence of subproblems and uses it to guide the intermediate reasoning steps.
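A minimal sketch of the decomposition idea, with hypothetical teacher/student stubs: subquestions are generated once, and each answer is threaded into the context for the next intermediate step:

```python
def teacher_decompose(question: str) -> list[str]:
    # Hypothetical stub: a large teacher model splits the problem
    # into a sequence of subquestions.
    return ["When did Aristotle live?", "When was the laptop invented?"]

def student_answer(subquestion: str, context: list[str]) -> str:
    # Hypothetical stub: a distilled student answers each subquestion;
    # the canned lookup here ignores the context it would normally use.
    canned = {
        "When did Aristotle live?": "384-322 BC",
        "When was the laptop invented?": "the 1980s",
    }
    return canned.get(subquestion, "unknown")

def solve(question: str) -> list[str]:
    context: list[str] = []
    for sub in teacher_decompose(question):
        context.append(f"{sub} -> {student_answer(sub, context)}")
    return context

print(solve("Did Aristotle use a laptop?"))
```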

Visconde: Multi-document QA with GPT-3 and Neural Reranking

neuralmind-ai/visconde 19 Dec 2022

This paper proposes a question-answering system that can answer questions whose supporting evidence is spread over multiple (potentially long) documents.
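A minimal sketch of the retrieve-rerank-read pipeline, with a toy lexical-overlap retriever standing in for BM25 and stubs in place of the paper's neural reranker and GPT-3 reader:

```python
def retrieve(question: str, corpus: list[str], k: int = 10) -> list[str]:
    # Hypothetical lexical-overlap score standing in for BM25.
    def score(doc: str) -> int:
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return sorted(corpus, key=score, reverse=True)[:k]

def rerank(question: str, passages: list[str], k: int = 3) -> list[str]:
    # Hypothetical: a neural reranker would score (question, passage)
    # pairs; here we simply keep the retrieval order.
    return passages[:k]

def reader(question: str, passages: list[str]) -> str:
    # Hypothetical stand-in for a few-shot LLM call over the evidence.
    return f"answer derived from {len(passages)} reranked passages"

corpus = [
    "Aristotle died in 322 BC.",
    "The first laptops appeared in the 1980s.",
    "Socrates taught Plato.",
]
question = "Did Aristotle use a laptop?"
passages = rerank(question, retrieve(question, corpus))
print(reader(question, passages))
```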

Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

nardien/kard NeurIPS 2023

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge.
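A minimal sketch of how KARD-style training pairs might be constructed: retrieved knowledge is concatenated with the question, and a teacher-generated rationale becomes the distillation target for the small model. Both helper functions are hypothetical stubs:

```python
def retrieve_knowledge(question: str) -> list[str]:
    # Hypothetical stub for retrieval from an external knowledge base.
    return ["Aristotle died in 322 BC.", "The first laptops appeared in the 1980s."]

def teacher_rationale(question: str) -> str:
    # Hypothetical stub for a rationale generated by a large teacher LLM.
    return ("Aristotle died in 322 BC and laptops appeared in the 1980s, "
            "so he could not have used one. Answer: no")

def build_training_example(question: str) -> dict:
    knowledge = "\n".join(retrieve_knowledge(question))
    return {
        "input": f"Question: {question}\nKnowledge:\n{knowledge}",
        "target": teacher_rationale(question),  # distillation target
    }

print(build_training_example("Did Aristotle use a laptop?"))
```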