Common Sense Reasoning

253 papers with code • 24 benchmarks • 52 datasets

Common sense reasoning tasks are intended to require models to go beyond pattern recognition. Instead of relying on surface statistics alone, a model should use "common sense" or world knowledge to make inferences.
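Most benchmarks in this area are posed as multiple-choice problems: the model scores each candidate answer and its top-scoring choice is compared against the gold label. The sketch below is a minimal, hypothetical evaluation harness illustrating that setup; `score_continuation` is a stand-in for a model-specific scorer (for example, a summed token log-likelihood), and the toy item is illustrative only.

```python
# Minimal sketch of how many common sense benchmarks are scored: the model
# rates each candidate answer and the highest-scoring one is its prediction.
# `score_continuation` is a hypothetical stand-in for a model-specific scorer.

from typing import Callable, Dict, List

def evaluate_multiple_choice(
    examples: List[Dict],
    score_continuation: Callable[[str, str], float],
) -> float:
    """Return accuracy over examples of the form
    {"context": str, "choices": [str, ...], "label": int}."""
    correct = 0
    for ex in examples:
        scores = [score_continuation(ex["context"], choice) for choice in ex["choices"]]
        prediction = max(range(len(scores)), key=scores.__getitem__)
        correct += int(prediction == ex["label"])
    return correct / len(examples)

if __name__ == "__main__":
    # Toy item in the spirit of a common sense benchmark: world knowledge,
    # not a surface pattern, is what separates the two choices.
    toy = [{
        "context": "The trophy didn't fit in the suitcase because it was too",
        "choices": [" large.", " purple."],
        "label": 0,
    }]
    # Trivial scorer just to make the sketch runnable; a real scorer would
    # query a language model for the likelihood of each continuation.
    print(evaluate_multiple_choice(toy, lambda ctx, cont: -len(cont)))
```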

Most implemented papers

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

microsoft/DeBERTa ICLR 2021

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

studio-ousia/luke EMNLP 2020

In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
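The entity-aware self-attention named in the title uses a different query projection depending on whether the attending and attended tokens are words or entities. Below is a simplified single-head sketch of that idea with random placeholder weights and arbitrary shapes; it is not the studio-ousia/luke implementation.

```python
# Simplified sketch of entity-aware self-attention (single head): the query
# projection used for a token pair depends on whether each token is a word or
# an entity. Weights and shapes are illustrative placeholders only.

import numpy as np

def entity_aware_attention(x, is_entity, W_q, W_k, W_v):
    """x: (n, d) token states; is_entity: (n,) bools;
    W_q: dict with one (d, d) query matrix per (attending, attended) type pair."""
    n, d = x.shape
    k, v = x @ W_k, x @ W_v
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            key = ("entity" if is_entity[i] else "word",
                   "entity" if is_entity[j] else "word")
            q_ij = x[i] @ W_q[key]                     # type-dependent query
            scores[i, j] = q_ij @ k[j] / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    x = rng.normal(size=(5, d))
    W_q = {pair: rng.normal(size=(d, d)) for pair in
           [("word", "word"), ("word", "entity"), ("entity", "word"), ("entity", "entity")]}
    out = entity_aware_attention(x, np.array([0, 0, 1, 0, 1], bool),
                                 W_q, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
    print(out.shape)  # (5, 8)
```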

mT5: A massively multilingual pre-trained text-to-text transformer

google-research/multilingual-t5 NAACL 2021

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks.
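In the text-to-text format, every task is reduced to mapping an input string (usually with a task prefix) to an output string. The sketch below illustrates that framing; the prefixes are modeled on examples from the T5 paper, and the records themselves are made up.

```python
# Sketch of the unified text-to-text format used by T5/mT5: every task is cast
# as feeding a prefixed input string to the model and training it to emit the
# answer as a string. Prefixes follow examples from the T5 paper; data is toy.

def to_text_to_text(task, example):
    if task == "translation_en_de":
        return ("translate English to German: " + example["en"], example["de"])
    if task == "summarization":
        return ("summarize: " + example["document"], example["summary"])
    if task == "cola":
        return ("cola sentence: " + example["sentence"], example["label"])  # e.g. "acceptable"
    raise ValueError(f"unknown task: {task}")

if __name__ == "__main__":
    src, tgt = to_text_to_text("translation_en_de",
                               {"en": "That is good.", "de": "Das ist gut."})
    print(src, "->", tgt)
```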

The "something something" video database for learning and evaluating visual common sense

jayleicn/singularity ICCV 2017

Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification.

Temporal Relational Reasoning in Videos

metalbubble/TRN-pytorch ECCV 2018

Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species.
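A minimal way to picture a temporal relation module: fuse features from ordered frame pairs with a small function and pool the results, so the prediction depends on the order of frames rather than on any single frame. The sketch below follows that idea in simplified form with random placeholder weights; the released metalbubble/TRN-pytorch code is multi-scale and trained end to end.

```python
# Minimal sketch of a 2-frame temporal relation module: features from ordered
# frame pairs are fused by g and summed, then mapped to class scores by h.
# Weights are random placeholders, not the paper's trained model.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def two_frame_relation(frame_feats, W_g, W_h):
    """frame_feats: (T, d) per-frame features -> class scores."""
    T, d = frame_feats.shape
    pair_sum = np.zeros(W_g.shape[1])
    for i in range(T):
        for j in range(i + 1, T):              # ordered pairs (frame i before frame j)
            pair = np.concatenate([frame_feats[i], frame_feats[j]])
            pair_sum += relu(pair @ W_g)       # g: fuse one ordered pair
    return pair_sum @ W_h                      # h: map pooled relation feature to scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d, hidden, n_classes = 8, 16, 32, 5
    scores = two_frame_relation(rng.normal(size=(T, d)),
                                rng.normal(size=(2 * d, hidden)),
                                rng.normal(size=(hidden, n_classes)))
    print(scores.shape)  # (5,)
```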

Finetuned Language Models Are Zero-Shot Learners

google-research/flan ICLR 2022

We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks.
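On the data side, instruction tuning amounts to rendering labeled examples from many tasks into natural-language (instruction, target) pairs and mixing them into one finetuning set, so an unseen task can later be posed as just another instruction. The sketch below illustrates that step with made-up templates and tasks, not FLAN's actual templates.

```python
# Sketch of instruction-tuning data preparation: examples from several tasks
# are rendered into (instruction prompt, target) pairs via templates and mixed
# into a single training set. Templates and tasks here are illustrative only.

import random

TEMPLATES = {
    "sentiment": ("Is the sentiment of the following review positive or negative?\n"
                  "Review: {text}\nAnswer:", "{label}"),
    "nli": ("Premise: {premise}\nHypothesis: {hypothesis}\n"
            "Does the premise entail the hypothesis? Answer yes or no.", "{label}"),
}

def render(task, example):
    prompt_tpl, target_tpl = TEMPLATES[task]
    return prompt_tpl.format(**example), target_tpl.format(**example)

def build_mixture(datasets, seed=0):
    """datasets: {task_name: [example dicts]} -> shuffled (prompt, target) pairs."""
    pairs = [render(task, ex) for task, examples in datasets.items() for ex in examples]
    random.Random(seed).shuffle(pairs)
    return pairs

if __name__ == "__main__":
    data = {
        "sentiment": [{"text": "Great film, would watch again.", "label": "positive"}],
        "nli": [{"premise": "A dog is sleeping.",
                 "hypothesis": "An animal is resting.", "label": "yes"}],
    }
    for prompt, target in build_mixture(data):
        print(prompt, "->", target, "\n")
```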

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

vllm-project/vllm 1 Jun 2023

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).
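The core idea behind activation-aware weight quantization is that input channels carrying large activations should lose the least precision: their weights are scaled up before quantization and the activations scaled down to compensate. The numpy sketch below is a simplified illustration of that scaling trick, not the actual method, which uses group-wise quantization and searches the scaling exponent on calibration data.

```python
# Simplified sketch of activation-aware scaling before low-bit weight
# quantization. Fake quantization (quantize then dequantize) is used so the
# effect on outputs can be measured directly. Group size, exponent, and data
# are illustrative only.

import numpy as np

def quantize_dequantize(W, n_bits=4):
    """Symmetric per-output-channel fake quantization of W with shape (d_in, d_out)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(W).max(axis=0, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)
    return np.clip(np.round(W / scale), -qmax - 1, qmax) * scale

def activation_aware_quantize(W, calib_x, alpha=0.5, n_bits=4):
    """Scale salient input channels (large mean |activation|) before quantizing.
    Returns (W_q, s) such that y ~= (x / s) @ W_q."""
    act = np.abs(calib_x).mean(axis=0)                  # per-input-channel saliency
    s = (act / act.mean()) ** alpha                     # mild activation-aware scaling
    W_q = quantize_dequantize(W * s[:, None], n_bits)   # loud channels lose less precision
    return W_q, s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out = 64, 32
    W = rng.normal(size=(d_in, d_out))
    x = rng.normal(size=(128, d_in))
    x[:, :4] *= 10.0                                    # a few "loud" input channels

    y_ref = x @ W
    err_plain = np.abs(y_ref - x @ quantize_dequantize(W)).mean()
    W_q, s = activation_aware_quantize(W, x)
    err_aware = np.abs(y_ref - (x / s) @ W_q).mean()
    print(f"mean |error|, plain int4: {err_plain:.3f}  activation-aware: {err_aware:.3f}")
```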

Mistral 7B

mistralai/mistral-src 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.