The Word Sense Disambiguation Test Suite at WMT18
We present a task to measure an MT system{'}s capability to translate ambiguous words with their correct sense according to the given context. The task is based on the German{--}English Word Sense Disambiguation (WSD) test set ContraWSD (Rios Gonzales et al., 2017), but it has been filtered to reduce noise, and the evaluation has been adapted to assess MT output directly rather than scoring existing translations. We evaluate all German{--}English submissions to the WMT{'}18 shared translation task, plus a number of submissions from previous years, and find that performance on the task has markedly improved compared to the 2016 WMT submissions (81{\%}→93{\%} accuracy on the WSD task). We also find that the unsupervised submissions to the task have a low WSD capability, and predominantly translate ambiguous source words with the same sense.
PDF Abstract