GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge

IJCNLP 2019 · Luyao Huang, Chi Sun, Xipeng Qiu, Xuanjing Huang

Word Sense Disambiguation (WSD) aims to find the exact sense of an ambiguous word in a particular context. Traditional supervised methods rarely take into consideration lexical resources such as WordNet, which are widely used in knowledge-based methods. Recent studies have shown the effectiveness of incorporating glosses (sense definitions) into neural networks for WSD. However, compared with traditional word-expert supervised methods, these approaches have not achieved much improvement. In this paper, we focus on how to better leverage gloss knowledge in a supervised neural WSD system. We construct context-gloss pairs and propose three BERT-based models for WSD. We fine-tune the pre-trained BERT model on the SemCor 3.0 training corpus, and experimental results on several English all-words WSD benchmark datasets show that our approach outperforms state-of-the-art systems.
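The core idea is compact enough to sketch: every WordNet sense of the target word contributes one (context, gloss) pair, and a binary BERT sentence-pair classifier scores whether the gloss matches the usage; the highest-scoring gloss wins. Below is a minimal sketch of that pipeline, assuming the HuggingFace transformers library and NLTK's WordNet interface. This is not the authors' released code: the target-word markup only approximates the paper's weak-supervision signal, the helper names are illustrative, and the classification head would need to be fine-tuned on SemCor-derived context-gloss pairs before its scores are meaningful.

import torch
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet")
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Binary head: does this gloss describe the target word's usage? (yes/no)
# In the paper this model is fine-tuned on SemCor context-gloss pairs;
# out of the box the head is randomly initialized.
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

def build_pairs(context, target_word, lemma):
    """Build one (marked context, gloss) pair per WordNet sense of `lemma`.

    Weak supervision, approximating the paper's "-WS" variants: quote the
    target word in the context and prefix each gloss with the target word.
    """
    marked = context.replace(target_word, f'" {target_word} "', 1)
    pairs = []
    for synset in wn.synsets(lemma):
        gloss = f"{target_word} : {synset.definition()}"
        pairs.append((marked, gloss, synset))
    return pairs

def disambiguate(context, target_word, lemma):
    """Score each context-gloss pair and return the best-scoring sense."""
    best_synset, best_score = None, float("-inf")
    for marked, gloss, synset in build_pairs(context, target_word, lemma):
        # Encode as a BERT sentence pair: [CLS] context [SEP] gloss [SEP]
        inputs = tokenizer(
            marked, gloss, return_tensors="pt", truncation=True, max_length=128
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        score = torch.softmax(logits, dim=-1)[0, 1].item()  # P(label = "yes")
        if score > best_score:
            best_synset, best_score = synset, score
    return best_synset, best_score

sense, p = disambiguate("He sat on the bank of the river.", "bank", "bank")
print(sense, p)

Treating WSD as sentence-pair classification is what lets the model share evidence across senses: rare senses are scored by the same binary classifier as frequent ones, rather than each sense needing its own output unit as in word-expert classifiers.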

Task                        Dataset     Model         Metric                                Value  Global Rank
Word Sense Disambiguation   Supervised  GlossBERT     Senseval 2 (F1, %)                    77.7   #13
Word Sense Disambiguation   Supervised  GlossBERT     Senseval 3 (F1, %)                    75.2   #13
Word Sense Disambiguation   Supervised  GlossBERT     SemEval 2007 (F1, %)                  72.5   #10
Word Sense Disambiguation   Supervised  GlossBERT     SemEval 2013 (F1, %)                  76.1   #12
Word Sense Disambiguation   Supervised  GlossBERT     SemEval 2015 (F1, %)                  80.4   #11
Word Sense Disambiguation   WiC-TSV     GlossBert-ws  Task 1 accuracy, all (%)              75.9   #3
Word Sense Disambiguation   WiC-TSV     GlossBert-ws  Task 1 accuracy, general purpose (%)  75.2   #1
Word Sense Disambiguation   WiC-TSV     GlossBert-ws  Task 1 accuracy, domain specific (%)  76.7   #4
Entity Linking              WiC-TSV     GlossBert-ws  Task 1 accuracy, all (%)              75.9   #3
Entity Linking              WiC-TSV     GlossBert-ws  Task 1 accuracy, general purpose (%)  75.2   #1
Entity Linking              WiC-TSV     GlossBert-ws  Task 1 accuracy, domain specific (%)  76.7   #4
