Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation

NAACL 2022 · Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang

We introduce Bi-SimCut: a simple but effective training strategy to boost neural machine translation (NMT) performance. It consists of two procedures: bidirectional pretraining and unidirectional finetuning. Both procedures utilize SimCut, a simple regularization method that enforces consistency between the output distributions of the original and the cutoff sentence pairs. Without leveraging extra data via back-translation or integrating large-scale pretrained models, Bi-SimCut achieves strong translation performance across five translation benchmarks (with data sizes ranging from 160K to 20.2M sentence pairs): BLEU scores of 31.16 for en -> de and 38.37 for de -> en on the IWSLT14 dataset, 30.78 for en -> de and 35.15 for de -> en on the WMT14 dataset, and 27.17 for zh -> en on the WMT17 dataset. SimCut is not a new method; rather, it is a simplified and adapted version of Cutoff (Shen et al., 2020) for NMT, and it can be viewed as a perturbation-based method. Given the universality and simplicity of SimCut and Bi-SimCut, we believe they can serve as strong baselines for future NMT research.
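
To make the two procedures concrete, here is a minimal PyTorch-style sketch of bidirectional data construction and of a SimCut-style consistency loss. All names (`make_bidirectional`, `token_cutoff`, `simcut_loss`, `p_cut`, `alpha`) and the model interface are illustrative assumptions rather than the paper's reference implementation, and the exact perturbation and KL formulation in the paper may differ.

```python
import torch
import torch.nn.functional as F

def make_bidirectional(pairs):
    # Bidirectional pretraining data: for every (src, tgt) pair, also
    # train on the reversed (tgt, src) pair; unidirectional finetuning
    # then continues on the original direction only. (Sketch only; the
    # paper may handle vocabularies and batching differently.)
    return pairs + [(tgt, src) for src, tgt in pairs]

def token_cutoff(emb, p_cut=0.05):
    # Generic token-level cutoff: zero out the embeddings of a random
    # subset of tokens. emb: (batch, seq_len, dim).
    keep = (torch.rand(emb.shape[:2], device=emb.device) > p_cut).float()
    return emb * keep.unsqueeze(-1)

def simcut_loss(model, src_emb, tgt_emb, target, alpha=3.0, p_cut=0.05):
    # Cross-entropy on the original pair plus a consistency term between
    # the output distributions of the original and the cutoff inputs
    # (written here as a symmetric KL; the paper's exact loss may differ).
    logits = model(src_emb, tgt_emb)          # (batch, seq_len, vocab)
    logits_cut = model(token_cutoff(src_emb, p_cut),
                       token_cutoff(tgt_emb, p_cut))

    ce = F.cross_entropy(logits.transpose(1, 2), target)

    log_p = F.log_softmax(logits, dim=-1)
    log_q = F.log_softmax(logits_cut, dim=-1)
    kl = 0.5 * (F.kl_div(log_q, log_p, reduction="batchmean", log_target=True)
                + F.kl_div(log_p, log_q, reduction="batchmean", log_target=True))
    return ce + alpha * kl
```

Note that the perturbed forward pass shares parameters with the original one, so in this sketch the regularizer costs only one extra forward/backward computation per batch and introduces no new model parameters.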


Datasets

IWSLT2014 English-German · IWSLT2014 German-English · WMT2014 English-German · WMT2014 German-English · WMT2017 Chinese-English

Results

Task                 Dataset                    Model      Metric       Value   Global Rank
Machine Translation  IWSLT2014 English-German   SimCut     BLEU score   30.98   # 3
Machine Translation  IWSLT2014 English-German   Bi-SimCut  BLEU score   31.16   # 2
Machine Translation  IWSLT2014 German-English   Bi-SimCut  BLEU score   38.37   # 3
Machine Translation  IWSLT2014 German-English   SimCut     BLEU score   37.81   # 6
Machine Translation  WMT2014 English-German     Bi-SimCut  BLEU score   30.78   # 7
Machine Translation  WMT2014 English-German     SimCut     BLEU score   30.56   # 10
Machine Translation  WMT2014 German-English     Bi-SimCut  BLEU score   35.15   # 1
Machine Translation  WMT2014 German-English     SimCut     BLEU score   34.86   # 3
