TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Named Entity Recognition (NER)	CoNLL 2003 (English)	LUKE 483M	F1	94.3	# 2
Entity Typing	Open Entity	MLMET	F1	78.2	# 1
Common Sense Reasoning	ReCoRD	LUKE 483M	F1	91.2	# 12
Common Sense Reasoning	ReCoRD	LUKE 483M	EM	90.6	# 10
Question Answering	SQuAD1.1	LUKE (single model)	EM	90.202	# 2
Question Answering	SQuAD1.1	LUKE (single model)	F1	95.379	# 3
Question Answering	SQuAD1.1	LUKE 483M	F1	95.4	# 2
Question Answering	SQuAD1.1	LUKE	EM	90.2	# 4
Question Answering	SQuAD1.1 dev	LUKE 483M	F1	95	# 4
Question Answering	SQuAD1.1 dev	LUKE	EM	89.8	# 2
Question Answering	SQuAD2.0	LUKE (single model)	EM	87.429	# 83
Question Answering	SQuAD2.0	LUKE (single model)	F1	90.163	# 84
Question Answering	SQuAD2.0	LUKE 483M	F1	90.2	# 83
Relation Classification	TACRED	LUKE 483M	F1	72.7	# 14
Relation Extraction	TACRED	LUKE	F1 (1% Few-Shot)	17.0	# 4
Relation Extraction	TACRED	LUKE	F1 (5% Few-Shot)	51.6	# 3

LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention

EMNLP 2020 · Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto

Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT. The task involves predicting randomly masked words and entities in a large entity-annotated corpus retrieved from Wikipedia. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer, and considers the types of tokens (words or entities) when computing attention scores. The proposed model achieves impressive empirical performance on a wide range of entity-related tasks. In particular, it obtains state-of-the-art results on five well-known datasets: Open Entity (entity typing), TACRED (relation classification), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), and SQuAD 1.1 (extractive question answering). Our source code and pretrained representations are available at https://github.com/studio-ousia/luke.
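
The entity-aware self-attention mentioned in the abstract can be illustrated with a minimal sketch. It assumes one possible way of making attention scores depend on token types: a separate query projection for each (attending type, attended type) pair, with the key and value projections shared across types. The sketch is single-head, pure Python, and uses random weights; every function and variable name below is hypothetical and is not taken from the studio-ousia/luke code.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def entity_aware_attention(hidden, token_types, queries, key, value):
    """Single-head attention whose query projection depends on token types.

    hidden:      list of input vectors, one per token
    token_types: list of "word" / "entity", aligned with hidden
    queries:     dict mapping (attending_type, attended_type) -> query matrix
    key, value:  projection matrices shared by all token types (assumption)
    """
    dim = len(key)
    keys = [matvec(key, h) for h in hidden]
    values = [matvec(value, h) for h in hidden]
    outputs = []
    for h_i, type_i in zip(hidden, token_types):
        # One query vector per attended type, picked by the attending type.
        q_by_type = {
            t: matvec(queries[(type_i, t)], h_i) for t in ("word", "entity")
        }
        scores = [
            sum(q * k for q, k in zip(q_by_type[type_j], k_j)) / math.sqrt(dim)
            for k_j, type_j in zip(keys, token_types)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Toy input: three word tokens followed by one entity token.
rng = random.Random(0)
dim = 4
token_types = ["word", "word", "word", "entity"]
hidden = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in token_types]
queries = {
    (a, b): random_matrix(rng, dim, dim)
    for a in ("word", "entity")
    for b in ("word", "entity")
}
key = random_matrix(rng, dim, dim)
value = random_matrix(rng, dim, dim)

outputs = entity_aware_attention(hidden, token_types, queries, key, value)
print(len(outputs), len(outputs[0]))
```

If all four query matrices in this sketch are set to the same matrix, the computation reduces to ordinary scaled dot-product attention, which makes the type-dependent part easy to isolate.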

Code

Repository	Stars
studio-ousia/luke (official)	683
huggingface/transformers	124,593
PaddlePaddle/PaddleNLP	11,384
mindspore-ai/models	334
JiachengLi1995/UCTopic

See all 8 implementations

Tasks

Common Sense Reasoning

Entity Typing

Extractive Question-Answering

Language Modelling

Named Entity Recognition

Named Entity Recognition (NER)

Question Answering

Relation Classification

Relation Extraction

Datasets

SQuAD

CoNLL 2003

TACRED

CoNLL

ReCoRD

Open Entity

Results from the Paper

Ranked #1 on Entity Typing on Open Entity

Methods

Adam • Attention Dropout • BERT • Dense Connections • Dropout • GELU • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Residual Connection • Scaled Dot-Product Attention • Softmax • Weight Decay • WordPiece
