Word Sense Disambiguation: A Unified Evaluation Framework and Empirical Comparison

Word Sense Disambiguation is a long-standing task in Natural Language Processing, lying at the core of human language understanding. However, the evaluation of automatic systems has been problematic, mainly due to the lack of a reliable evaluation framework. In this paper we develop a unified evaluation framework and analyze the performance of various Word Sense Disambiguation systems in a fair setup. The results show that supervised systems clearly outperform knowledge-based models. Among the supervised systems, a linear classifier trained on conventional local features still proves to be a hard baseline to beat. Nonetheless, recent approaches exploiting neural networks on unlabeled corpora achieve promising results, surpassing this hard baseline in most test sets.

Task                       Dataset           Model                  Metric Name   Metric Value  Global Rank
Word Sense Disambiguation  Knowledge-based:  WN 1st sense baseline  All           65.2          # 4
                                                                    Senseval 2    66.8          # 4
                                                                    Senseval 3    66.2          # 1
                                                                    SemEval 2007  55.2          # 2
                                                                    SemEval 2013  63.0          # 5
                                                                    SemEval 2015  67.8          # 3
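
The "WN 1st sense baseline" reported above is generally understood to assign each target word the first sense listed for it in WordNet, ignoring context. The following is a minimal sketch of that idea, not the paper's implementation: the sense inventory, the sense labels, and the gold instances are hypothetical toy data, and the function names `first_sense` and `score` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy sense inventory: lemma -> senses in listed order.
# These labels are illustrative only, not real WordNet sense keys.
TOY_INVENTORY: Dict[str, List[str]] = {
    "bank": ["bank.financial_institution", "bank.river_side"],
    "plant": ["plant.factory", "plant.living_organism"],
}


def first_sense(lemma: str, inventory: Dict[str, List[str]]) -> str:
    """Return the first listed sense for a lemma, ignoring context."""
    return inventory[lemma][0]


def score(instances: List[Tuple[str, str]],
          inventory: Dict[str, List[str]]) -> float:
    """Percentage of instances whose first listed sense equals the gold sense."""
    correct = sum(first_sense(lemma, inventory) == gold
                  for lemma, gold in instances)
    return 100.0 * correct / len(instances)


# Hypothetical gold-annotated instances: (lemma, gold sense).
GOLD: List[Tuple[str, str]] = [
    ("bank", "bank.financial_institution"),
    ("bank", "bank.river_side"),
    ("plant", "plant.factory"),
    ("plant", "plant.factory"),
]

if __name__ == "__main__":
    print(score(GOLD, TOY_INVENTORY))
```

Because this baseline answers every instance, the percentage of correct answers coincides with precision, recall, and F1 on the toy data. A real run would replace the toy inventory with WordNet lookups and the toy instances with the benchmark test sets.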

