Demystifying Learning of Unsupervised Neural Machine Translation

1 Jan 2021 · Guanlin Li, Lemao Liu, Taro Watanabe, Conghui Zhu, Tiejun Zhao

Unsupervised Neural Machine Translation (UNMT) has received great attention in recent years. Although tremendous empirical improvements have been achieved, theory-oriented investigation is still lacking, so fundamental questions, such as why a certain training protocol works and under what circumstances it fails, remain poorly understood. This paper attempts to provide theoretical insights into these questions. Specifically, following the methodology of comparative study, we leverage two perspectives, i) marginal likelihood maximization and ii) mutual information from information theory, to understand the different learning effects of the standard training protocol and its variants. Our detailed analyses reveal several critical conditions for the successful training of UNMT.
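For readers unfamiliar with the two perspectives, the following is a minimal sketch of the standard formulations they typically build on; the notation (p_theta, q, x, y) is illustrative and not necessarily the paper's. Under the first view, back-translation can be read as maximizing a variational lower bound on the marginal likelihood of monolingual data x, treating translations y as latent variables; under the second, a useful translation model is one whose outputs retain information about their sources.

```latex
% Marginal likelihood view: back-translation maximizes a lower bound on
% the log-marginal of monolingual data x, with translations y as latents.
\log p_{\theta}(x)
  \;\ge\; \mathbb{E}_{y \sim q(y \mid x)}\!\left[ \log p_{\theta}(x \mid y) \right]
          \;-\; \mathrm{KL}\!\left( q(y \mid x) \,\|\, p_{\theta}(y) \right)

% Mutual information view: training should keep I(X; Y) high, i.e.,
% translations y should remain predictive of the source sentences x.
I(X; Y) \;=\; H(X) - H(X \mid Y)
        \;=\; \mathbb{E}_{p(x, y)}\!\left[ \log \frac{p(x \mid y)}{p(x)} \right]
```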
