Measuring Massive Multitask Language Understanding

hendrycks/test 7 Sep 2020

By comprehensively evaluating the breadth and depth of a model's academic and professional understanding, our test can be used to analyze models across many tasks and to identify important shortcomings.

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

huggingface/transformers NeurIPS 2020

Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks.

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

lianjiatech/belle NeurIPS 2023

Existing methods for gaining such steerability collect human labels of the relative quality of model generations and fine-tune the unsupervised LM to align with these preferences, often with reinforcement learning from human feedback (RLHF).

REALM: Retrieval-Augmented Language Model Pre-Training

google-research/language 10 Feb 2020

Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering.

Imagine This! Scripts to Compositions to Videos

ubc-vision/make-a-story ECCV 2018

Imagining a scene described in natural language with realistic layout and appearance of entities is the ultimate test of spatial, visual, and semantic world knowledge.

Mistral 7B

mistralai/mistral-src 10 Oct 2023

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency.

MEIM: Multi-partition Embedding Interaction Beyond Block Term Format for Efficient and Expressive Link Prediction

tranhungnghiep/meim-kge 30 Sep 2022

Knowledge graph embedding aims to predict the missing relations between entities in knowledge graphs.

CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

jonathanherzig/commonsenseqa NAACL 2019

To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering.

Breaking NLI Systems with Sentences that Require Simple Lexical Inferences

BIU-NLP/Breaking_NLI ACL 2018

We create a new NLI test set that shows the deficiency of state-of-the-art models in inferences that require lexical and world knowledge.

ASER: A Large-scale Eventuality Knowledge Graph

HKUST-KnowComp/ASER 1 May 2019

Understanding human language requires complex world knowledge.