Classical Structured Prediction Losses for Sequence to Sequence Learning

NAACL 2018 · Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc'Aurelio Ranzato

There has been much recent work on training neural attention models at the sequence level, using either reinforcement learning-style methods or by optimizing the beam. In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction, and apply them to neural sequence-to-sequence models...
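One of the classical sequence-level objectives of the kind the paper surveys is expected risk minimization, which weights a task cost (e.g. 1 − sentence-BLEU) by the model's normalized probability over a candidate set. The sketch below is illustrative only and is not taken from the paper's implementation; the candidate scores and costs are hypothetical values.

```python
import math

def expected_risk(scores, costs):
    """Sequence-level expected risk over a candidate set.

    scores: model log-scores for each candidate sequence
            (e.g. hypotheses produced by beam search)
    costs:  task cost per candidate, e.g. 1 - sentence-BLEU
            against the reference

    Risk = sum_u p(u) * cost(u), where p(u) is a softmax over
    the candidate set. Minimizing this objective moves probability
    mass toward low-cost (high-BLEU) candidates.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    z = sum(exps)
    probs = [e / z for e in exps]
    return sum(p * c for p, c in zip(probs, costs))

# Hypothetical candidate set: three hypotheses with log-scores and costs
risk = expected_risk(scores=[2.0, 1.0, 0.5], costs=[0.1, 0.4, 0.7])
print(round(risk, 4))  # ~0.2535: dominated by the low-cost top candidate
```

In training, this quantity would be minimized with respect to the model parameters that produce the scores; here it is shown only as a forward computation.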


Evaluation results from the paper

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Machine Translation | IWSLT2015 German-English | ConvS2S+Risk | BLEU score | 32.93 | #6 |
| Machine Translation | IWSLT2015 German-English | ConvS2S (MLE+SLE) | BLEU score | 32.84 | #7 |