Search Results for author: Naman Goyal

Found 41 papers, 21 papers with code

Findings of the WMT 2020 Shared Task on Parallel Corpus Filtering and Alignment

no code implementations WMT (EMNLP) 2020 Philipp Koehn, Vishrav Chaudhary, Ahmed El-Kishky, Naman Goyal, Peng-Jen Chen, Francisco Guzmán

Following the two preceding WMT Shared Tasks on Parallel Corpus Filtering (Koehn et al., 2018, 2019), we again posed the challenge of assigning sentence-level quality scores to very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting the highest-quality data to be used to train machine translation systems.
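
In code, that sub-selection step amounts to scoring each sentence pair and keeping the best pairs up to a data budget. Below is a minimal sketch; the scorer and token budget are hypothetical placeholders, not any submitted system:

```python
# Minimal sketch of score-and-subselect corpus filtering.
# The scorer here is a toy placeholder, not a real filtering model.
from typing import Callable, List, Tuple

def filter_corpus(
    pairs: List[Tuple[str, str]],
    scorer: Callable[[str, str], float],
    token_budget: int,
) -> List[Tuple[str, str]]:
    """Keep the highest-scoring sentence pairs until the target-side
    token budget is exhausted."""
    scored = sorted(pairs, key=lambda p: scorer(*p), reverse=True)
    kept, used = [], 0
    for src, tgt in scored:
        n_tokens = len(tgt.split())
        if used + n_tokens > token_budget:
            break
        kept.append((src, tgt))
        used += n_tokens
    return kept

# Purely illustrative scorer: penalize length mismatch between source and target.
toy_scorer = lambda src, tgt: -abs(len(src.split()) - len(tgt.split()))
sample = [("a noisy line", "ein Beispiel"), ("hello world", "hallo welt")]
print(filter_corpus(sample, toy_scorer, token_budget=2))
```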

Sentence Translation

Facebook AI’s WMT20 News Translation Task Submission

no code implementations WMT (EMNLP) 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low-resource problem using two main strategies: leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Text-To-4D Dynamic Scene Generation

no code implementations 26 Jan 2023 Uriel Singer, Shelly Sheynin, Adam Polyak, Oron Ashual, Iurii Makarov, Filippos Kokkinos, Naman Goyal, Andrea Vedaldi, Devi Parikh, Justin Johnson, Yaniv Taigman

We present MAV3D (Make-A-Video3D), a method for generating three-dimensional dynamic scenes from text descriptions.

Scene Generation

Scaling Laws for Generative Mixed-Modal Language Models

no code implementations 10 Jan 2023 Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer

To better understand the scaling properties of such mixed-modal models, we conducted over 250 experiments using seven different modalities and model sizes ranging from 8 million to 30 billion parameters, trained on 5-100 billion tokens.

A survey on Self Supervised learning approaches for improving Multimodal representation learning

no code implementations 20 Oct 2022 Naman Goyal

Recently, self-supervised learning has seen explosive growth and use in a variety of machine learning tasks because of its ability to avoid the cost of annotating large-scale datasets.

Representation Learning Self-Supervised Learning +1

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

2 code implementations 5 Aug 2022 Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

We present BlenderBot 3, a 175B-parameter dialogue model capable of open-domain conversation, with access to the internet and a long-term memory, trained on a large number of user-defined tasks.

Continual Learning

On the Role of Bidirectionality in Language Model Pre-Training

no code implementations 24 May 2022 Mikel Artetxe, Jingfei Du, Naman Goyal, Luke Zettlemoyer, Ves Stoyanov

Prior work on language model pre-training has explored different architectures and learning objectives, but differences in data, hyperparameters and evaluation make a principled comparison difficult.

Language Modelling Text Infilling

Lifting the Curse of Multilinguality by Pre-training Modular Transformers

no code implementations NAACL 2022 Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe

Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.

named-entity-recognition Named Entity Recognition +3

Graph Neural Networks for Image Classification and Reinforcement Learning using Graph representations

no code implementations 7 Mar 2022 Naman Goyal, David Steiner

In this paper, we will evaluate the performance of graph neural networks in two distinct domains: computer vision and reinforcement learning.

Image Classification Inductive Bias +3

CM3: A Causal Masked Multimodal Model of the Internet

no code implementations 19 Jan 2022 Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, Naman Goyal, Dmytro Okhonko, Mandar Joshi, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer

We introduce CM3, a family of causally masked generative models trained over a large corpus of structured multi-modal documents that can contain both text and image tokens.

Entity Disambiguation Entity Linking

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations 20 Dec 2021 Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.

Language Modelling

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

no code implementations ACL (IWSLT) 2021 Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.

Transfer Learning Translation

Larger-Scale Transformers for Multilingual Masked Language Modeling

no code implementations ACL (RepL4NLP) 2021 Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau

Our model also outperforms the RoBERTa-Large model on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages.

Masked Language Modeling XLM-R

BASE Layers: Simplifying Training of Large, Sparse Models

1 code implementation 30 Mar 2021 Mike Lewis, Shruti Bhosale, Tim Dettmers, Naman Goyal, Luke Zettlemoyer

Sparse layers can dramatically improve the efficiency of training and inference by routing each token to specialized expert modules that contain only a small fraction of the model parameters.
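
As a rough illustration of that routing idea, the sketch below sends each token to its single highest-scoring expert. Note that BASE Layers actually solve a balanced assignment so every expert receives the same number of tokens; this toy greedy version does not do that.

```python
# Toy sparse layer: greedy top-1 token-to-expert routing (illustration only;
# not the balanced-assignment routing used by BASE Layers).
import torch
import torch.nn as nn

class SimpleSparseLayer(nn.Module):
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.expert_emb = nn.Parameter(torch.randn(n_experts, d_model))
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (n_tokens, d_model); route each token to its best expert.
        scores = tokens @ self.expert_emb.t()   # (n_tokens, n_experts)
        assign = scores.argmax(dim=-1)          # top-1 expert per token
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = assign == e
            if mask.any():
                out[mask] = expert(tokens[mask])  # each expert sees only its tokens
        return out

layer = SimpleSparseLayer(d_model=16, n_experts=4)
print(layer(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```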

Multilingual Autoregressive Entity Linking

1 code implementation 23 Mar 2021 Nicola De Cao, Ledell Wu, Kashyap Popat, Mikel Artetxe, Naman Goyal, Mikhail Plekhanov, Luke Zettlemoyer, Nicola Cancedda, Sebastian Riedel, Fabio Petroni

Moreover, in a zero-shot setting on languages with no training data at all, mGENRE treats the target language as a latent variable that is marginalized at prediction time.
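
A toy illustration of that marginalization, with entirely made-up scores (not mGENRE's actual probabilities): each candidate entity is scored by summing, over languages, the probability of generating its name in that language.

```python
# Toy marginalization over a latent language variable.
# Scores are invented for illustration only.
def score_entity(name_scores_by_lang: dict) -> float:
    """Sum p(entity name in language | mention) over candidate languages."""
    return sum(name_scores_by_lang.values())

candidates = {
    "Paris (capital of France)": {"en": 0.40, "fr": 0.35, "de": 0.10},
    "Paris (town in Texas)":     {"en": 0.05, "fr": 0.01, "de": 0.01},
}
best = max(candidates, key=lambda c: score_entity(candidates[c]))
print(best)  # the city wins: its mass summed over languages is larger
```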

Ranked #2 on Entity Disambiguation on Mewsli-9 (using extra training data)

Entity Disambiguation Entity Linking

Facebook AI's WMT20 News Translation Task Submission

no code implementations 16 Nov 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low-resource problem using two main strategies: leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Beyond English-Centric Multilingual Machine Translation

7 code implementations 21 Oct 2020 Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.

Machine Translation Translation

Better Fine-Tuning by Reducing Representational Collapse

3 code implementations ICLR 2021 Armen Aghajanyan, Akshat Shrivastava, Anchit Gupta, Naman Goyal, Luke Zettlemoyer, Sonal Gupta

Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods.
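
A rough sketch of the trust-region idea referenced above: perturb the input embeddings with small noise and penalize divergence between the clean and perturbed predictions. The interface below assumes a HuggingFace-style classifier (accepting `inputs_embeds` and returning `.logits`) and is a simplified illustration, not the paper's exact objective.

```python
# Sketch of a trust-region-style fine-tuning penalty: task loss plus a
# symmetric KL term between clean and noise-perturbed predictions.
# Simplified illustration under the assumptions stated above.
import torch
import torch.nn.functional as F

def trust_region_loss(model, embeds, labels, noise_std=1e-5, reg_weight=1.0):
    logits_clean = model(inputs_embeds=embeds).logits
    noise = torch.randn_like(embeds) * noise_std
    logits_noisy = model(inputs_embeds=embeds + noise).logits

    task_loss = F.cross_entropy(logits_clean, labels)
    p = F.log_softmax(logits_clean, dim=-1)
    q = F.log_softmax(logits_noisy, dim=-1)
    sym_kl = (F.kl_div(p, q, log_target=True, reduction="batchmean")
              + F.kl_div(q, p, log_target=True, reduction="batchmean"))
    return task_loss + reg_weight * sym_kl
```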

Abstractive Text Summarization Cross-Lingual Natural Language Inference

Multilingual Translation with Extensible Multilingual Pretraining and Finetuning

5 code implementations 2 Aug 2020 Yuqing Tang, Chau Tran, Xi-An Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan

Recent work demonstrates the potential of multilingual pretraining to create one model that can be used for various tasks in different languages.

Machine Translation Translation

Multilingual Denoising Pre-training for Neural Machine Translation

5 code implementations 22 Jan 2020 Yinhan Liu, Jiatao Gu, Naman Goyal, Xi-An Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Denoising Sentence +2

Unsupervised Cross-lingual Representation Learning at Scale

26 code implementations ACL 2020 Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Cross-Lingual Transfer Multilingual NLP +2

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

42 code implementations ACL 2020 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
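
A rough sketch of those two noising operations, sentence permutation and text infilling, where each sampled span is replaced by a single mask token. The masking ratio and span-length distribution below loosely follow the paper's description and are not the released implementation.

```python
# Sketch of BART-style document noising: shuffle sentence order, then replace
# sampled token spans with a single <mask> token each.
import random
import numpy as np

def noise_document(sentences, mask_token="<mask>", mask_ratio=0.3, poisson_lam=3.0):
    # 1) Sentence permutation.
    sentences = sentences[:]
    random.shuffle(sentences)

    # 2) Text infilling: mask spans until roughly mask_ratio of tokens are covered.
    tokens = " ".join(sentences).split()
    n_to_mask = int(len(tokens) * mask_ratio)
    masked = 0
    while masked < n_to_mask and len(tokens) > 1:
        span_len = int(np.random.poisson(poisson_lam))
        span_len = max(1, min(span_len, len(tokens) - 1))
        start = random.randrange(0, len(tokens) - span_len + 1)
        tokens[start:start + span_len] = [mask_token]  # whole span -> one mask
        masked += span_len
    return " ".join(tokens)

print(noise_document(["the cat sat on the mat .", "it was warm ."]))
```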

Abstractive Text Summarization Denoising +5

RoBERTa: A Robustly Optimized BERT Pretraining Approach

58 code implementations 26 Jul 2019 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Ranked #1 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)

Document Image Classification Language Modelling +13

The Social Dynamics of Language Change in Online Networks

no code implementations 7 Sep 2016 Rahul Goel, Sandeep Soni, Naman Goyal, John Paparrizos, Hanna Wallach, Fernando Diaz, Jacob Eisenstein

Language change is a complex social phenomenon, revealing pathways of communication and sociocultural influence.
