Search Results for author: Giri Anantharaman

Found 2 papers, 0 papers with code

Larger-Scale Transformers for Multilingual Masked Language Modeling

no code implementations ACL (RepL4NLP) 2021 Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau

Our model also outperforms the RoBERTa-Large model on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages.

Masked Language Modeling XLM-R

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations 20 Dec 2021 Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.

Language Modelling
