A Taxonomy and Study of Critical Errors in Machine Translation

EAMT 2022 · Khetam Al Sharou, Lucia Specia

Not all machine mistranslations are equal. For example, mistranslating a date or time in an appointment, mistranslating a number or currency in a contract, or hallucinating profanity may lead to consequences for users even when MT is used only for gisting. The severity of errors is an important but overlooked aspect of MT quality evaluation. In this paper, we present the results of our effort to raise awareness of the problem of critical translation errors. We study, validate and improve an initial taxonomy of critical errors with a view to providing guidance for critical error analysis, annotation and mitigation. We test the taxonomy on three different languages to examine to what extent it generalises across languages. We provide an account of factors that affect annotation tasks, along with recommendations on how to improve the practice in future work. We also study the impact of the source text on the generation of critical errors in the translation and, based on this, propose a set of recommendations on aspects of MT that need further scrutiny, especially for user-generated content, to avoid generating such errors and hence improve online communication.
