Confidence through Attention

MT Summit 2017 · Matīss Rikters, Mark Fishel

Attention distributions over the generated translations are a useful by-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens. In this work, we use attention distributions as a confidence metric for output translations. We present two strategies for using the attention distributions: filtering out bad translations from a large back-translated corpus, and selecting the best translation in a hybrid setup of two different translation systems. While manual evaluation indicated only a weak correlation between our confidence score and human judgments, the use cases showed improvements of up to 2.22 BLEU points for filtering and 0.99 points for hybrid translation, tested on English↔German and English↔Latvian translation.
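The abstract's core idea can be sketched in a few lines: treat the attention matrix as a soft alignment and score a translation by how sharply and evenly it covers the source. The two penalties below (a coverage-deviation term and an entropy-based dispersion term) are illustrative assumptions in the spirit of the abstract, not the paper's verbatim formulas; the function name and normalization are hypothetical.

```python
import math

def confidence_score(attn):
    """Score a translation from its attention matrix.

    attn: target_len x source_len list of lists of attention weights,
    where each row (one target token) sums to 1. Scores closer to 0
    indicate sharper, better-covered alignments. This is a sketch of
    the general technique, not the paper's exact metric.
    """
    tgt_len, src_len = len(attn), len(attn[0])
    # Coverage deviation: penalize source tokens whose total received
    # attention differs from 1 (attended too little or too much).
    coverage = [sum(row[j] for row in attn) for j in range(src_len)]
    cdp = -sum(math.log1p((1.0 - c) ** 2) for c in coverage) / src_len
    # Dispersion penalty: penalize high-entropy (diffuse) attention rows,
    # normalized by log(source_len) so scores are length-comparable.
    ent = sum(-sum(p * math.log(p) for p in row if p > 0) for row in attn)
    ap = -(ent / tgt_len) / math.log(src_len)
    return cdp + ap

# Sharp one-to-one attention scores higher than uniformly diffuse attention.
sharp = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
diffuse = [[0.25] * 4 for _ in range(4)]
assert confidence_score(sharp) > confidence_score(diffuse)
```

In the filtering use case described above, such a score would be computed for every sentence pair in the back-translated corpus and pairs below a threshold discarded; in the hybrid use case, the candidate with the higher score would be selected.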


Datasets


Task                 Dataset                   Model                                   Metric  Value  Global Rank
Machine Translation  WMT 2017 Latvian-English  Attention-based Hybrid NMT combination  BLEU    14.83  #3
