Sense Vocabulary Compression through the Semantic Knowledge of WordNet for Neural Word Sense Disambiguation

14 May 2019 · Loïc Vial, Benjamin Lecouteux, Didier Schwab

In this article, we tackle the issue of the limited quantity of manually sense-annotated corpora for the task of word sense disambiguation (WSD) by exploiting the semantic relationships between senses, such as synonymy, hypernymy and hyponymy, to compress the sense vocabulary of Princeton WordNet and thus reduce the number of different sense tags that must be observed to disambiguate all words of the lexical database. We propose two different methods that greatly reduce the size of neural WSD models, with the benefit of improving their coverage without additional training data and without impacting their precision.
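
The core idea — replacing fine-grained sense tags with ancestor tags from WordNet's hypernym hierarchy so that many synsets share one label — can be illustrated in a few lines of NLTK. The following is a minimal sketch, assuming the NLTK WordNet corpus is installed (nltk.download('wordnet')); the fixed-depth cutoff and the `compressed_tag`/`depth` names are illustrative simplifications, not the paper's actual algorithm, which climbs the hierarchy only as far as the senses of each word stay distinguishable from one another.

```python
# Sketch of hypernymy-based sense vocabulary compression using NLTK's
# WordNet interface. The grouping rule here (cut each synset's first
# hypernym path at a fixed depth) is a deliberate simplification of
# the paper's method, which preserves distinctions between senses of
# the same word.
from nltk.corpus import wordnet as wn

def compressed_tag(synset, depth=2):
    """Replace a fine-grained synset tag with an ancestor tag.

    `depth` controls how close to the hierarchy root we cut; smaller
    values merge more senses into a single tag (hypothetical
    parameter, not from the paper).
    """
    path = synset.hypernym_paths()[0]        # one path: root -> synset
    ancestor = path[min(depth, len(path) - 1)]
    return ancestor.name()                   # e.g. 'entity.n.01'

# Count how many distinct tags survive for all noun synsets.
tags = {compressed_tag(s) for s in wn.all_synsets('n')}
total = sum(1 for _ in wn.all_synsets('n'))
print(f"{len(tags)} compressed tags instead of {total} noun synsets")
```

With a small `depth`, tens of thousands of noun synsets collapse into a few coarse tags, which is exactly the effect that shrinks a neural model's output layer while letting every annotated occurrence of an ancestor tag serve as training evidence for all the senses it covers.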

TASK                       DATASET               MODEL                                METRIC NAME  METRIC VALUE  GLOBAL RANK
Word Sense Disambiguation  SemEval 2007 Task 17  SemCor+WNGC, hypernyms               F1           73.4          #1
Word Sense Disambiguation  SemEval 2007 Task 7   SemCor+WNGC, hypernyms               F1           90.4          #1
Word Sense Disambiguation  SemEval 2013 Task 12  SemCor+WNGC, hypernyms               F1           78.7          #1
Word Sense Disambiguation  SemEval 2015 Task 13  SemCor+WNGC, hypernyms               F1           82.6          #1
Word Sense Disambiguation  SensEval 2            SemCor+WNGC, hypernyms               F1           79.7          #1
Word Sense Disambiguation  SensEval 3 Task 1     SemCor+WNGC, hypernyms               F1           77.8          #1
Word Sense Disambiguation  Senseval 2            Supervised: SemCor+WNGC, hypernyms   F1           79.7          #2
Word Sense Disambiguation  Senseval 3            Supervised: SemCor+WNGC, hypernyms   F1           77.8          #3
Word Sense Disambiguation  SemEval 2007          Supervised: SemCor+WNGC, hypernyms   F1           73.4          #3
Word Sense Disambiguation  SemEval 2013          Supervised: SemCor+WNGC, hypernyms   F1           78.7          #5
Word Sense Disambiguation  SemEval 2015          Supervised: SemCor+WNGC, hypernyms   F1           82.6          #2

Methods used in the Paper


METHOD                           TYPE
Residual Connection              Skip Connections
Attention Dropout                Regularization
Linear Warmup With Linear Decay  Learning Rate Schedules
Weight Decay                     Regularization
GELU                             Activation Functions
Dense Connections                Feedforward Networks
Adam                             Stochastic Optimization
WordPiece                        Subword Segmentation
Softmax                          Output Functions
Dropout                          Regularization
Multi-Head Attention             Attention Modules
Layer Normalization              Normalization
Scaled Dot-Product Attention     Attention Mechanisms
BERT                             Language Models
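
For readers unfamiliar with the components above, the following is a minimal NumPy sketch of scaled dot-product attention, the mechanism at the heart of BERT's multi-head attention; the function name, shapes and toy data are illustrative and not taken from the paper's code.

```python
# Scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)      # (n_q, n_k)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)           # row-wise softmax
    return weights @ V                                  # (n_q, d_v)

# Toy usage: 4 query tokens attending over 6 key/value tokens.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
print(scaled_dot_product_attention(Q, K, V).shape)      # (4, 8)
```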