Capturing Longer Context for Document-level Neural Machine Translation: A Multi-resolutional Approach

18 Oct 2020  ·  Zewei Sun, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Shujian Huang, Jiajun Chen, Lei Li

Discourse context has been proven useful when translating documents. It is quite a challenge to incorporate long document context in the prevailing neural machine translation models such as Transformer... In this paper, we propose multi-resolutional (MR) Doc2Doc, a method to train a neural sequence-to-sequence model for document-level translation. Our trained model can simultaneously translate sentence by sentence as well as a document as a whole. We evaluate our method and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that MR Doc2Doc outperforms sentence-level models and previous methods in a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation.
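The abstract only sketches the multi-resolutional idea. Below is a minimal Python sketch, under the assumption that "multi-resolutional" means each parallel document is repeatedly halved so the same data appears as whole documents, intermediate chunks, and single sentences, all mixed as training examples. The function name `multi_resolution_pairs` and the `</s>` separator are illustrative choices, not taken from the paper or its released code.

```python
# Minimal sketch (not the authors' code): build sequence-to-sequence training
# pairs at several granularities from one sentence-aligned parallel document.
# Resolution k splits the document into 2**k contiguous chunks; the halving
# scheme and the "</s>" sentence separator are assumptions for illustration.

from typing import List, Tuple


def multi_resolution_pairs(
    src_sents: List[str],
    tgt_sents: List[str],
    sep: str = " </s> ",
) -> List[Tuple[str, str]]:
    """Return (source, target) pairs at every resolution of a parallel document."""
    assert len(src_sents) == len(tgt_sents), "document must be sentence-aligned"
    pairs = []
    chunks = [(src_sents, tgt_sents)]          # resolution 0: the whole document
    while chunks:
        next_chunks = []
        for src, tgt in chunks:
            pairs.append((sep.join(src), sep.join(tgt)))
            if len(src) > 1:                   # halve until single sentences remain
                mid = len(src) // 2
                next_chunks.append((src[:mid], tgt[:mid]))
                next_chunks.append((src[mid:], tgt[mid:]))
        chunks = next_chunks
    return pairs


if __name__ == "__main__":
    src = ["S1", "S2", "S3", "S4"]
    tgt = ["T1", "T2", "T3", "T4"]
    for s, t in multi_resolution_pairs(src, tgt):
        print(s, "=>", t)
```

Mixing all resolutions in one training set is what would let a single model translate sentence by sentence or a document as a whole, as the abstract describes; the exact splitting and separator conventions in the paper may differ.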
