Linguistic evaluation of German-English Machine Translation using a Test Suite

WS 2019 · Eleftherios Avramidis, Vivien Macketanz, Ursula Strohriegel, Hans Uszkoreit

We present the results of applying a grammatical test suite for German$\rightarrow$English MT to the systems submitted to WMT19, with a detailed analysis of 107 phenomena organized in 14 categories. On average, the systems still mistranslate one out of four test items. Low performance is observed for idioms, modals, pseudo-clefts, multi-word expressions and verb valency. Compared to last year, there has been an improvement in function words, non-verbal agreement and punctuation. More detailed conclusions about particular systems and phenomena are also presented.