Search Results for author: Felipe Perez

Found 4 papers, 3 papers with code

Improving Transformer Optimization Through Better Initialization

1 code implementation • ICML 2020 • Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs

As Transformer models are becoming larger and more expensive to train, recent research has focused on understanding and improving optimization in these models.

Language Modelling • Machine Translation • +1

DiMS: Distilling Multiple Steps of Iterative Non-Autoregressive Transformers for Machine Translation

1 code implementation • 7 Jun 2022 • Sajad Norouzi, Rasa Hosseinzadeh, Felipe Perez, Maksims Volkovs

The student is optimized to predict the output of the teacher after multiple decoding steps while the teacher follows the student via a slow-moving average.

Machine Translation • Translation
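The slow-moving-average teacher described in the abstract is commonly implemented as an exponential moving average (EMA) of the student's weights. A minimal sketch of that update, assuming a simple parameter-dictionary representation (the function name, decay value, and toy parameters here are illustrative, not taken from the DiMS code):

```python
import numpy as np

def ema_update(teacher, student, decay=0.999):
    """One slow-moving-average (EMA) teacher step:
    teacher <- decay * teacher + (1 - decay) * student.
    Parameter dicts map names to arrays; shapes must match."""
    return {name: decay * teacher[name] + (1 - decay) * student[name]
            for name in teacher}

# Toy illustration (hypothetical values): the teacher trails the student.
student = {"w": np.array([1.0, -2.0])}
teacher = {"w": np.array([0.0, 0.0])}
for _ in range(100):
    teacher = ema_update(teacher, student, decay=0.99)
# After many updates the teacher has drifted most of the way toward
# the (here fixed) student weights.
```

In the paper's setup the student would additionally be trained to match, in one decoding step, the output the EMA teacher produces after multiple decoding steps; that loss is omitted here since it depends on the full model.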

Improving Non-Autoregressive Translation Models Without Distillation

no code implementations • ICLR 2022 • Xiao Shi Huang, Felipe Perez, Maksims Volkovs

Empirically, we show that CMLMC achieves state-of-the-art NAR performance when trained on raw data without distillation and approaches AR performance on multiple datasets.

Language Modelling • Machine Translation • +1
