no code implementations • 9 May 2022 • Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes
In this paper we share findings from our effort to build practical machine translation (MT) systems capable of translating between more than one thousand languages.
no code implementations • 2 May 2022 • Bowen Liang, Pidong Wang, Yuan Cao
Label smoothing is ubiquitously applied in Neural Machine Translation (NMT) training.
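For readers unfamiliar with the technique, here is a minimal sketch of how label smoothing is typically folded into the cross-entropy objective in NMT training. The standalone NumPy formulation, function name, and default smoothing value are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def label_smoothed_cross_entropy(logits, target_ids, epsilon=0.1):
    """Cross-entropy against smoothed target distributions.

    logits:     (batch, vocab) unnormalized decoder scores
    target_ids: (batch,) gold token indices
    epsilon:    smoothing mass spread uniformly over the vocabulary
                (illustrative default; in practice a tuned hyperparameter)
    """
    batch, vocab = logits.shape
    # Smoothed targets: epsilon / vocab everywhere, plus 1 - epsilon on the gold token.
    smoothed = np.full((batch, vocab), epsilon / vocab)
    smoothed[np.arange(batch), target_ids] += 1.0 - epsilon
    # Log-softmax computed in a numerically stable way.
    log_probs = logits - logits.max(axis=1, keepdims=True)
    log_probs = log_probs - np.log(np.exp(log_probs).sum(axis=1, keepdims=True))
    # Expected negative log-likelihood under the smoothed target distribution.
    return -(smoothed * log_probs).sum(axis=1).mean()
```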
no code implementations • ACL 2022 • Yong Cheng, Ankur Bapna, Orhan Firat, Yuan Cao, Pidong Wang, Wolfgang Macherey
The dominant inductive bias applied to these models is a shared vocabulary and a shared set of parameters across languages; however, the inputs and labels corresponding to examples drawn from different language pairs may still reside in distinct sub-spaces.
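To make the shared-vocabulary, shared-parameter setup concrete, below is a small sketch of a common multilingual NMT convention in which examples from all language pairs are mapped through one vocabulary and the target language is signaled with a prepended tag. The tag names, whitespace tokenizer, and helper functions are hypothetical illustrations of that general convention, not this paper's method.

```python
from typing import Dict, List

def build_shared_vocab(corpora: Dict[str, List[str]], lang_tags: List[str]) -> Dict[str, int]:
    """One vocabulary shared by every language pair, plus special language tags."""
    vocab: Dict[str, int] = {tag: i for i, tag in enumerate(lang_tags)}
    for sentences in corpora.values():
        for sentence in sentences:
            for token in sentence.split():  # whitespace tokenizer, for illustration only
                vocab.setdefault(token, len(vocab))
    return vocab

def encode_example(src: str, tgt_lang_tag: str, vocab: Dict[str, int]) -> List[int]:
    """Prepend the target-language tag so a single shared model can route translations."""
    return [vocab[t] for t in [tgt_lang_tag] + src.split()]

# Usage: English->French and English->German examples share the same id space,
# even though their inputs and labels come from different language pairs.
corpora = {"en-fr": ["the cat sat"], "en-de": ["the dog ran"]}
vocab = build_shared_vocab(corpora, ["<2fr>", "<2de>"])
print(encode_example("the cat sat", "<2fr>", vocab))
```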