StrategyQA
20 papers with code • 0 benchmarks • 0 datasets
StrategyQA measures the ability of models to answer yes/no questions whose required reasoning steps are implicit in the question and must be inferred, rather than being stated explicitly.
Source: BIG-bench
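For reference, each StrategyQA example pairs a short yes/no question with the implicit reasoning steps made explicit as a decomposition, plus supporting facts. The sketch below is illustrative: the field names follow the released JSON format as commonly described, and the values paraphrase the dataset's flagship question.

```python
# Illustrative StrategyQA example (field names assumed from the released
# JSON format; values paraphrase the dataset's flagship question).
example = {
    "qid": "example-1",                 # hypothetical identifier
    "question": "Did Aristotle use a laptop?",
    "answer": False,                    # all answers are yes/no booleans
    "decomposition": [                  # the implicit strategy, made explicit
        "When did Aristotle live?",
        "When was the laptop invented?",
        "Is #2 before #1?",
    ],
    "facts": [
        "Aristotle died in 322 BC.",
        "The first laptop was invented in 1980.",
    ],
}
```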
Most implemented papers
PaLM: Scaling Language Modeling with Pathways
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM).
Scaling Language Models: Methods, Analysis & Insights from Training Gopher
Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks.
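The core recipe is easy to sketch: sample several chain-of-thought completions at nonzero temperature, extract each final answer, and return the most common one. In the sketch below, `generate(prompt, temperature)` and `extract_answer(text)` are assumed interfaces standing in for whatever model and parser you use, not a real API.

```python
from collections import Counter

def self_consistency(prompt, generate, extract_answer, n_samples=10):
    """Sample diverse chain-of-thought paths and majority-vote the answers.

    `generate(prompt, temperature)` is assumed to return one sampled
    completion; `extract_answer(text)` is assumed to pull the final
    answer (e.g. "yes"/"no") out of it. Both are hypothetical interfaces.
    """
    answers = []
    for _ in range(n_samples):
        completion = generate(prompt, temperature=0.7)  # diverse reasoning paths
        answers.append(extract_answer(completion))
    # Marginalize over reasoning paths: the most common final answer wins.
    return Counter(answers).most_common(1)[0][0]
```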
Improving Planning with Large Language Models: A Modular Agentic Architecture
To improve planning with LLMs, we propose an agentic architecture, the Modular Agentic Planner (MAP), in which planning is accomplished via the recurrent interaction of specialized modules, each implemented using an LLM.
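A minimal sketch of such a recurrent module loop is below. The module roles (actor, predictor, evaluator) and prompts are illustrative assumptions, not the paper's exact module set; `llm(prompt)` is an assumed text-in/text-out interface.

```python
def map_plan(goal, state, llm, max_steps=10):
    """Illustrative modular planning loop in the spirit of MAP.

    `llm(prompt)` is a hypothetical text-in/text-out interface; the
    module names and prompts below are assumptions for illustration.
    """
    plan = []
    for _ in range(max_steps):
        # Actor module: propose a candidate next action toward the goal.
        action = llm(f"Goal: {goal}\nState: {state}\nPropose the next action.")
        # Predictor module: simulate the state that action would produce.
        predicted = llm(f"State: {state}\nAction: {action}\nPredict the next state.")
        # Evaluator module: judge whether the predicted state reaches the goal.
        verdict = llm(f"Goal: {goal}\nState: {predicted}\nIs the goal reached? yes/no")
        plan.append(action)
        state = predicted
        if verdict.strip().lower().startswith("yes"):
            break
    return plan
```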
Training Compute-Optimal Large Language Models
We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget.
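The paper's headline result (the "Chinchilla" scaling law) is that parameters and training tokens should grow in roughly equal proportion with compute, which under the common C ≈ 6ND FLOPs approximation works out to about 20 training tokens per parameter. A back-of-the-envelope helper, with the constant treated as a rule of thumb rather than an exact prescription:

```python
def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Back-of-the-envelope compute-optimal model sizing.

    Combines the common approximation C = 6 * N * D (training FLOPs for
    N parameters and D tokens) with the roughly-20-tokens-per-parameter
    rule of thumb; the constant is approximate.
    """
    # C = 6 * N * D and D = r * N  =>  N = sqrt(C / (6 * r))
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's own budget (~5.76e23 FLOPs) recovers ~70B params, ~1.4T tokens.
params, tokens = chinchilla_optimal(5.76e23)
print(f"{params / 1e9:.0f}B params, {tokens / 1e12:.2f}T tokens")
```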
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
This paper introduces rStar, a self-play mutual reasoning approach that significantly improves the reasoning capabilities of small language models (SLMs) without fine-tuning or help from stronger models.
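The mutual-verification idea can be sketched without the paper's full MCTS machinery: one SLM proposes reasoning trajectories, a second SLM completes masked versions of them, and only trajectories where the two agree on the answer are kept. `generator` and `discriminator` below are hypothetical interfaces for the two models.

```python
from collections import Counter

def mutual_reasoning(question, generator, discriminator, n_candidates=8):
    """Simplified sketch of rStar-style mutual verification (no MCTS).

    `generator(question)` is assumed to return (reasoning_steps, answer);
    `discriminator(question, partial_steps)` is assumed to return the
    answer it reaches by completing the partial trace. Both interfaces
    are hypothetical stand-ins for the two SLMs.
    """
    verified = []
    for _ in range(n_candidates):
        steps, answer = generator(question)             # SLM #1 proposes a trajectory
        partial = steps[: len(steps) // 2]              # hide the back half of the trace
        peer_answer = discriminator(question, partial)  # SLM #2 completes it
        if peer_answer == answer:                       # mutual-consistency check
            verified.append(answer)
    # Return the most common verified answer, or None if none survived.
    return Counter(verified).most_common(1)[0][0] if verified else None
```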
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
A key limitation of current datasets for multi-hop reasoning is that the steps required to answer a question are stated explicitly in the question itself.
Distilling Reasoning Capabilities into Smaller Language Models
In this work, we propose an alternative reasoning scheme, Socratic CoT, that learns a decomposition of the original problem into a sequence of subproblems and uses it to guide the intermediate reasoning steps.
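The decomposition-guided loop is straightforward to sketch: one model generates subquestions, another answers them in sequence, and each sub-answer is fed back as context for the next step. `questioner` and `answerer` below are hypothetical stand-ins for the paper's two distilled student models.

```python
def socratic_cot(question, questioner, answerer):
    """Sketch of decomposition-guided reasoning in the spirit of Socratic CoT.

    `questioner(question)` is assumed to return a list of subquestions;
    `answerer(context, subquestion)` is assumed to return a short answer.
    Both are hypothetical interfaces.
    """
    context = question
    answer = None
    for sub_q in questioner(question):
        answer = answerer(context, sub_q)
        # Append each solved subproblem so later steps can condition on it.
        context += f"\nQ: {sub_q}\nA: {answer}"
    # The answer to the final subproblem is taken as the overall answer.
    return answer
```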
Visconde: Multi-document QA with GPT-3 and Neural Reranking
This paper proposes a question-answering system that can answer questions whose supporting evidence is spread over multiple (potentially long) documents.
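The pipeline shape is retrieve, rerank, then read: cast a wide net over the document collection, keep only the passages a reranker scores highest, and let the LLM answer from that evidence. In the sketch below, `retrieve`, `rerank`, and `llm` are assumed interfaces rather than a real library API; Visconde uses a neural reranker for the middle step.

```python
def multi_doc_qa(question, retrieve, rerank, llm, k=5):
    """Sketch of a retrieve-rerank-read pipeline like Visconde's.

    Assumed interfaces: `retrieve(q)` returns candidate passages,
    `rerank(q, passages)` orders them by relevance, and `llm(prompt)`
    generates the final answer. All three are hypothetical.
    """
    passages = retrieve(question)             # broad first-stage retrieval
    top = rerank(question, passages)[:k]      # keep only the best evidence
    context = "\n\n".join(top)
    prompt = (
        "Answer using the evidence below. Reason step by step.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)
```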
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge.