A Call for Clarity in Reporting BLEU Scores

EMNLP 2018 Matt Post

The field of machine translation faces an under-recognized problem because of inconsistency in the reporting of scores from its dominant metric. Although people refer to "the" BLEU score, BLEU is in fact a parameterized metric whose values can vary wildly with changes to these parameters... (read more)

PDF Abstract

Evaluation results from the paper


  Submit results from this paper to get state-of-the-art GitHub badges and help community compare results to other papers.