ERNIE: Enhanced Language Representation with Informative Entities

Neural language representation models such as BERT, pre-trained on large-scale corpora, capture rich semantic patterns from plain text and can be fine-tuned to consistently improve performance on various NLP tasks. However, existing pre-trained language models rarely incorporate knowledge graphs (KGs), which can provide rich structured knowledge facts for better language understanding. We argue that informative entities in KGs can enhance language representation with external knowledge. In this paper, we use both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE), which can take full advantage of lexical, syntactic, and knowledge information simultaneously. The experimental results demonstrate that ERNIE achieves significant improvements on various knowledge-driven tasks while remaining comparable with the state-of-the-art model BERT on other common NLP tasks. The source code of this paper can be obtained from

Published at ACL 2019.

Results from the Paper

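| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Linguistic Acceptability | CoLA | ERNIE | Accuracy | 52.3% | # 21 |
| Relation Extraction | FewRel | ERNIE | F1 | 88.32 | # 1 |
| Relation Extraction | FewRel | ERNIE | Precision | 88.49 | # 1 |
| Relation Extraction | FewRel | ERNIE | Recall | 88.44 | # 1 |
| Entity Linking | FIGER | ERNIE | Accuracy | 57.19 | # 1 |
| Entity Linking | FIGER | ERNIE | Macro F1 | 76.51 | # 1 |
| Entity Linking | FIGER | ERNIE | Micro F1 | 73.39 | # 1 |
| Semantic Textual Similarity | MRPC | ERNIE | Accuracy | 88.2% | # 14 |
| Natural Language Inference | MultiNLI | ERNIE | Matched | 84.0 | # 22 |
| Natural Language Inference | MultiNLI | ERNIE | Mismatched | 83.2 | # 19 |
| Entity Typing | Open Entity | ERNIE | F1 | 75.56 | # 3 |
| Entity Typing | Open Entity | ERNIE | Precision | 78.42 | # 3 |
| Entity Typing | Open Entity | ERNIE | Recall | 72.9 | # 3 |
| Natural Language Inference | QNLI | ERNIE | Accuracy | 91.3% | # 20 |
| Paraphrase Identification | Quora Question Pairs | ERNIE | F1 | 71.2 | # 11 |
| Natural Language Inference | RTE | ERNIE | Accuracy | 68.8% | # 26 |
| Sentiment Analysis | SST-2 Binary classification | ERNIE | Accuracy | 93.5 | # 30 |
| Semantic Textual Similarity | STS Benchmark | ERNIE | Pearson Correlation | 0.832 | # 22 |
| Relation Extraction | TACRED | ERNIE | F1 | 67.97 | # 21 |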
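
Several rows above report precision, recall, and F1 together. As a rough sanity check, F1 is conventionally the harmonic mean of precision and recall. The sketch below recomputes it for the Open Entity row; the helper name `f1_score` is an assumption made for illustration and is not taken from the paper's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Open Entity row: precision 78.42, recall 72.9
open_entity_f1 = f1_score(78.42, 72.9)
print(round(open_entity_f1, 2))  # 75.56
```

The Open Entity figures are consistent with this formula. The FewRel figures are not (the reported F1 is lower than both precision and recall), which may indicate that those metrics were averaged per class before reporting; the listing does not say.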