Common Sense Reasoning

253 papers with code • 24 benchmarks • 52 datasets

Common sense reasoning tasks are intended to require models to go beyond pattern recognition. Instead of relying on surface statistics alone, a model should use "common sense" or world knowledge to make inferences.
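Most benchmarks in this area are posed as multiple-choice problems: the model scores each candidate answer and its top-scoring choice is compared against the gold label. The sketch below is a minimal, hypothetical evaluation harness illustrating that setup; `score_continuation` is a stand-in for a model-specific scorer (for example, a summed token log-likelihood), and the toy item is illustrative only.

```python
# Minimal sketch of how many common sense benchmarks are scored: the model
# rates each candidate answer and the highest-scoring one is its prediction.
# `score_continuation` is a hypothetical stand-in for a model-specific scorer.

from typing import Callable, Dict, List

def evaluate_multiple_choice(
    examples: List[Dict],
    score_continuation: Callable[[str, str], float],
) -> float:
    """Return accuracy over examples of the form
    {"context": str, "choices": [str, ...], "label": int}."""
    correct = 0
    for ex in examples:
        scores = [score_continuation(ex["context"], choice) for choice in ex["choices"]]
        prediction = max(range(len(scores)), key=scores.__getitem__)
        correct += int(prediction == ex["label"])
    return correct / len(examples)

if __name__ == "__main__":
    # Toy item in the spirit of a common sense benchmark: world knowledge,
    # not a surface pattern, is what separates the two choices.
    toy = [{
        "context": "The trophy didn't fit in the suitcase because it was too",
        "choices": [" large.", " purple."],
        "label": 0,
    }]
    # Trivial scorer just to make the sketch runnable; a real scorer would
    # query a language model for the likelihood of each continuation.
    print(evaluate_multiple_choice(toy, lambda ctx, cont: -len(cont)))
```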

Most implemented papers

DeBERTa: Decoding-enhanced BERT with Disentangled Attention

microsoft/DeBERTa ICLR 2021

Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

studio-ousia/luke EMNLP 2020

In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
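The entity-aware self-attention named in the title uses a different query projection depending on whether the attending and attended tokens are words or entities. Below is a simplified single-head sketch of that idea with random placeholder weights and arbitrary shapes; it is not the studio-ousia/luke implementation.

```python
# Simplified sketch of entity-aware self-attention (single head): the query
# projection used for a token pair depends on whether each token is a word or
# an entity. Weights and shapes are illustrative placeholders only.

import numpy as np

def entity_aware_attention(x, is_entity, W_q, W_k, W_v):
    """x: (n, d) token states; is_entity: (n,) bools;
    W_q: dict with one (d, d) query matrix per (attending, attended) type pair."""
    n, d = x.shape
    k, v = x @ W_k, x @ W_v
    scores = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            key = ("entity" if is_entity[i] else "word",
                   "entity" if is_entity[j] else "word")
            q_ij = x[i] @ W_q[key]                     # type-dependent query
            scores[i, j] = q_ij @ k[j] / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    x = rng.normal(size=(5, d))
    W_q = {pair: rng.normal(size=(d, d)) for pair in
           [("word", "word"), ("word", "entity"), ("entity", "word"), ("entity", "entity")]}
    out = entity_aware_attention(x, np.array([0, 0, 1, 0, 1], bool),
                                 W_q, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
    print(out.shape)  # (5, 8)
```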

mT5: A massively multilingual pre-trained text-to-text transformer

google-research/multilingual-t5 NAACL 2021

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks.
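In the text-to-text format, every task is reduced to mapping an input string (usually with a task prefix) to an output string. The sketch below illustrates that framing; the prefixes are modeled on examples from the T5 paper, and the records themselves are made up.

```python
# Sketch of the unified text-to-text format used by T5/mT5: every task is cast
# as feeding a prefixed input string to the model and training it to emit the
# answer as a string. Prefixes follow examples from the T5 paper; data is toy.

def to_text_to_text(task, example):
    if task == "translation_en_de":
        return ("translate English to German: " + example["en"], example["de"])
    if task == "summarization":
        return ("summarize: " + example["document"], example["summary"])
    if task == "cola":
        return ("cola sentence: " + example["sentence"], example["label"])  # e.g. "acceptable"
    raise ValueError(f"unknown task: {task}")

if __name__ == "__main__":
    src, tgt = to_text_to_text("translation_en_de",
                               {"en": "That is good.", "de": "Das ist gut."})
    print(src, "->", tgt)
```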

The "something something" video database for learning and evaluating visual common sense

jayleicn/singularity ICCV 2017

Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification.

Temporal Relational Reasoning in Videos

metalbubble/TRN-pytorch ECCV 2018

Temporal relational reasoning, the ability to link meaningful transformations of objects or entities over time, is a fundamental property of intelligent species.
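A minimal way to picture a temporal relation module: fuse features from ordered frame pairs with a small function and pool the results, so the prediction depends on the order of frames rather than on any single frame. The sketch below follows that idea in simplified form with random placeholder weights; the released metalbubble/TRN-pytorch code is multi-scale and trained end to end.

```python
# Minimal sketch of a 2-frame temporal relation module: features from ordered
# frame pairs are fused by g and summed, then mapped to class scores by h.
# Weights are random placeholders, not the paper's trained model.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def two_frame_relation(frame_feats, W_g, W_h):
    """frame_feats: (T, d) per-frame features -> class scores."""
    T, d = frame_feats.shape
    pair_sum = np.zeros(W_g.shape[1])
    for i in range(T):
        for j in range(i + 1, T):              # ordered pairs (frame i before frame j)
            pair = np.concatenate([frame_feats[i], frame_feats[j]])
            pair_sum += relu(pair @ W_g)       # g: fuse one ordered pair
    return pair_sum @ W_h                      # h: map pooled relation feature to scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d, hidden, n_classes = 8, 16, 32, 5
    scores = two_frame_relation(rng.normal(size=(T, d)),
                                rng.normal(size=(2 * d, hidden)),
                                rng.normal(size=(hidden, n_classes)))
    print(scores.shape)  # (5,)
```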

Finetuned Language Models Are Zero-Shot Learners

google-research/flan ICLR 2022

We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks.
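On the data side, instruction tuning amounts to rendering labeled examples from many tasks into natural-language (instruction, target) pairs and mixing them into one finetuning set, so an unseen task can later be posed as just another instruction. The sketch below illustrates that step with made-up templates and tasks, not FLAN's actual templates.

```python
# Sketch of instruction-tuning data preparation: examples from several tasks
# are rendered into (instruction prompt, target) pairs via templates and mixed
# into a single training set. Templates and tasks here are illustrative only.

import random

TEMPLATES = {
    "sentiment": ("Is the sentiment of the following review positive or negative?\n"
                  "Review: {text}\nAnswer:", "{label}"),
    "nli": ("Premise: {premise}\nHypothesis: {hypothesis}\n"
            "Does the premise entail the hypothesis? Answer yes or no.", "{label}"),
}

def render(task, example):
    prompt_tpl, target_tpl = TEMPLATES[task]
    return prompt_tpl.format(**example), target_tpl.format(**example)

def build_mixture(datasets, seed=0):
    """datasets: {task_name: [example dicts]} -> shuffled (prompt, target) pairs."""
    pairs = [render(task, ex) for task, examples in datasets.items() for ex in examples]
    random.Random(seed).shuffle(pairs)
    return pairs

if __name__ == "__main__":
    data = {
        "sentiment": [{"text": "Great film, would watch again.", "label": "positive"}],
        "nli": [{"premise": "A dog is sleeping.",
                 "hypothesis": "An animal is resting.", "label": "yes"}],
    }
    for prompt, target in build_mixture(data):
        print(prompt, "->", target, "\n")
```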

PaLM: Scaling Language Modeling with Pathways

lucidrains/CoCa-pytorch Google Research 2022

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM).

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

vllm-project/vllm 1 Jun 2023

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).
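The core idea behind activation-aware weight quantization is that input channels carrying large activations should lose the least precision: their weights are scaled up before quantization and the activations scaled down to compensate. The numpy sketch below is a simplified illustration of that scaling trick, not the actual method, which uses group-wise quantization and searches the scaling exponent on calibration data.

```python
# Simplified sketch of activation-aware scaling before low-bit weight
# quantization. Fake quantization (quantize then dequantize) is used so the
# effect on outputs can be measured directly. Group size, exponent, and data
# are illustrative only.

import numpy as np

def quantize_dequantize(W, n_bits=4):
    """Symmetric per-output-channel fake quantization of W with shape (d_in, d_out)."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(W).max(axis=0, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)
    return np.clip(np.round(W / scale), -qmax - 1, qmax) * scale

def activation_aware_quantize(W, calib_x, alpha=0.5, n_bits=4):
    """Scale salient input channels (large mean |activation|) before quantizing.
    Returns (W_q, s) such that y ~= (x / s) @ W_q."""
    act = np.abs(calib_x).mean(axis=0)                  # per-input-channel saliency
    s = (act / act.mean()) ** alpha                     # mild activation-aware scaling
    W_q = quantize_dequantize(W * s[:, None], n_bits)   # loud channels lose less precision
    return W_q, s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out = 64, 32
    W = rng.normal(size=(d_in, d_out))
    x = rng.normal(size=(128, d_in))
    x[:, :4] *= 10.0                                    # a few "loud" input channels

    y_ref = x @ W
    err_plain = np.abs(y_ref - x @ quantize_dequantize(W)).mean()
    W_q, s = activation_aware_quantize(W, x)
    err_aware = np.abs(y_ref - (x / s) @ W_q).mean()
    print(f"mean |error|, plain int4: {err_plain:.3f}  activation-aware: {err_aware:.3f}")
```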

Mistral 7B

mistralai/mistral-src 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.