Search Results for author: Melvin Johnson

Found 30 papers, 7 papers with code

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation

2 code implementations · ICML 2020 · Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

Zero-Shot Cross-Lingual Transfer

DOCmT5: Document-Level Pretraining of Multilingual Language Models

no code implementations · 16 Dec 2021 · Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson

In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence language model pre-trained with large scale parallel documents.

Document Translation · Language Modelling · +2

Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents

no code implementations · 21 Sep 2021 · Biao Zhang, Ankur Bapna, Melvin Johnson, Ali Dabirmoghaddam, Naveen Arivazhagan, Orhan Firat

Using simple concatenation-based DocNMT, we explore the effect of 3 factors on multilingual transfer: the number of document-supervised teacher languages, the data schedule for parallel documents at training, and the data condition of parallel documents (genuine vs. backtranslated).

Machine Translation · Translation
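
"Simple concatenation-based DocNMT" in the snippet above refers to building document-level training examples by joining consecutive aligned sentences. A minimal sketch of that data-preparation step follows; the separator token and window size are illustrative assumptions, not taken from the paper.

```python
SEP = " <sep> "

def make_doc_examples(src_sents, tgt_sents, window=4):
    """Concatenate each fixed-size window of aligned sentences
    into one document-level source/target pair."""
    examples = []
    for i in range(0, len(src_sents), window):
        src_doc = SEP.join(src_sents[i:i + window])
        tgt_doc = SEP.join(tgt_sents[i:i + window])
        examples.append((src_doc, tgt_doc))
    return examples

# Toy parallel document (German -> English).
src = ["Er kam spät an.", "Der Zug hatte Verspätung.", "Niemand wartete."]
tgt = ["He arrived late.", "The train was delayed.", "Nobody was waiting."]
for s, t in make_doc_examples(src, tgt, window=2):
    print(s, "=>", t)
```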

HintedBT: Augmenting Back-Translation with Quality and Transliteration Hints

no code implementations · EMNLP 2021 · Sahana Ramnath, Melvin Johnson, Abhirut Gupta, Aravindan Raghuveer

For such cases, we propose training the model with additional hints (as target tags on the decoder) that provide information about the operation required on the source (translation or both translation and transliteration).

Data Augmentation · Translation · +1
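
The tagging idea in this entry can be made concrete with a short sketch: each back-translated pair gets a decoder-side tag naming the operation the model must perform. The tag strings and the script-based heuristic below are illustrative assumptions, not the paper's exact scheme.

```python
TRANSLATE = "<translate>"
TRANSLATE_TRANSLIT = "<translate+transliterate>"

def needs_transliteration(src: str) -> bool:
    """Toy heuristic: a non-Latin-script source is assumed to also
    require transliteration (e.g. for named entities)."""
    return any(ord(ch) > 127 for ch in src)

def tag_pair(src: str, tgt: str) -> tuple:
    """Prepend the operation hint as a target tag on the decoder side."""
    tag = TRANSLATE_TRANSLIT if needs_transliteration(src) else TRANSLATE
    return src, f"{tag} {tgt}"

print(tag_pair("नमस्ते दुनिया", "hello world"))      # tagged for transliteration
print(tag_pair("bonjour le monde", "hello world"))  # plain translation tag
```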

nmT5 - Is parallel data still relevant for pre-training massively multilingual language models?

no code implementations · ACL 2021 · Mihir Kale, Aditya Siddhant, Rami Al-Rfou, Linting Xue, Noah Constant, Melvin Johnson

Recently, mT5 - a massively multilingual version of T5 - leveraged a unified text-to-text format to attain state-of-the-art results on a wide variety of multilingual NLP tasks.

Language Modelling · Machine Translation · +2

MergeDistill: Merging Pre-trained Language Models using Distillation

no code implementations · 5 Jun 2021 · Simran Khanuja, Melvin Johnson, Partha Talukdar

Pre-trained multilingual language models (LMs) have achieved state-of-the-art results in cross-lingual transfer, but they often lead to an inequitable representation of languages due to limited capacity, skewed pre-training data, and sub-optimal vocabularies.

Cross-Lingual Transfer · Knowledge Distillation
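
The snippet above states only the motivation; the mechanism named in the title is teacher-student distillation. As a hedged illustration of that general mechanism (not MergeDistill's exact recipe), the student below is trained toward a teacher's temperature-softened output distribution; in a MergeDistill-style setup, each pre-trained teacher LM would supervise the student on the languages it covers.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Cross-entropy between the softened teacher distribution and the
    student distribution (temperature T is an illustrative choice)."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T) + 1e-12)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()

# Toy batch: 4 tokens, vocabulary of 10.
student = np.random.randn(4, 10)
teacher = np.random.randn(4, 10)
print(distill_loss(student, teacher))
```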

Gradient-guided Loss Masking for Neural Machine Translation

no code implementations · 26 Feb 2021 · Xinyi Wang, Ankur Bapna, Melvin Johnson, Orhan Firat

To mitigate the negative effect of low quality training data on the performance of neural machine translation models, most existing strategies focus on filtering out harmful data before training starts.

Machine Translation · Translation
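
The title's alternative to static filtering is dynamic: down-weight training signals whose gradients conflict with those of trusted data. The sketch below, for a linear model, keeps an example's loss only when its gradient aligns with the gradient on a small clean set; the model, data, and thresholding rule are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)                      # current model parameters
X_train, y_train = rng.normal(size=(8, 3)), rng.normal(size=8)
X_clean, y_clean = rng.normal(size=(4, 3)), rng.normal(size=4)  # trusted data

def grad(X, y, w):
    """Gradient of mean squared error for a linear model."""
    return 2 * X.T @ (X @ w - y) / len(y)

# Mask out examples whose gradient points against the clean-data gradient.
g_clean = grad(X_clean, y_clean, w)
mask = np.array([
    grad(X_train[i:i + 1], y_train[i:i + 1], w) @ g_clean > 0
    for i in range(len(y_train))
])
print("examples kept this step:", int(mask.sum()), "of", len(mask))
```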

They, Them, Theirs: Rewriting with Gender-Neutral English

no code implementations · 12 Feb 2021 · Tony Sun, Kellie Webster, Apu Shah, William Yang Wang, Melvin Johnson

Responsible development of technology involves applications being inclusive of the diverse set of users they hope to support.

Distilling Large Language Models into Tiny and Effective Students using pQRNN

no code implementations · 21 Jan 2021 · Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson

Our strong results suggest that our approach is great for latency-sensitive applications while being able to leverage large mBERT-like models.

Data Augmentation · Semantic Parsing

Explicit Alignment Objectives for Multilingual Bidirectional Encoders

no code implementations · NAACL 2021 · Junjie Hu, Melvin Johnson, Orhan Firat, Aditya Siddhant, Graham Neubig

Pre-trained cross-lingual encoders such as mBERT (Devlin et al., 2019) and XLMR (Conneau et al., 2020) have proven to be impressively effective at enabling transfer-learning of NLP systems from high-resource languages to low-resource languages.

Sentence Classification · Transfer Learning · +1

Adaptive Scheduling for Multi-Task Learning

no code implementations · 13 Sep 2019 · Sébastien Jean, Orhan Firat, Melvin Johnson

To train neural machine translation models simultaneously on multiple tasks (languages), it is common to sample each task uniformly or in proportion to dataset sizes.

Machine Translation · Multi-Task Learning · +1
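
The two schemes this snippet names (uniform and size-proportional sampling) are the endpoints of the standard temperature-based interpolation used in multilingual NMT; a small sketch with illustrative dataset sizes:

```python
import numpy as np

sizes = np.array([1_000_000, 50_000, 2_000], dtype=float)  # examples per language

uniform = np.full(len(sizes), 1 / len(sizes))
proportional = sizes / sizes.sum()

def temperature_probs(sizes, T=5.0):
    """T=1 recovers proportional sampling; large T approaches uniform."""
    p = sizes ** (1.0 / T)
    return p / p.sum()

print("uniform:     ", uniform)
print("proportional:", proportional)
print("T=5:         ", temperature_probs(sizes))
```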

Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation

no code implementations · 1 Sep 2019 · Aditya Siddhant, Melvin Johnson, Henry Tsai, Naveen Arivazhagan, Jason Riesa, Ankur Bapna, Orhan Firat, Karthik Raman

The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating over 100 languages to and from English within a single model.

Cross-Lingual Transfer · Machine Translation · +2

Small and Practical BERT Models for Sequence Labeling

no code implementations · IJCNLP 2019 · Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer

We propose a practical scheme to train a single multilingual sequence labeling model that yields state of the art results and is small and fast enough to run on a single CPU.

Part-Of-Speech Tagging

Direct speech-to-speech translation with a sequence-to-sequence model

no code implementations · 12 Apr 2019 · Ye Jia, Ron J. Weiss, Fadi Biadsy, Wolfgang Macherey, Melvin Johnson, Zhifeng Chen, Yonghui Wu

We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.

Speech Synthesis · Speech-to-Speech Translation · +3

The Missing Ingredient in Zero-Shot Neural Machine Translation

no code implementations · 17 Mar 2019 · Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, Wolfgang Macherey

Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages.

Machine Translation · Translation

Massively Multilingual Neural Machine Translation

no code implementations · NAACL 2019 · Roee Aharoni, Melvin Johnson, Orhan Firat

Our experiments on a large-scale dataset with 102 languages to and from English and up to one million examples per direction also show promising results, surpassing strong bilingual baselines and encouraging future work on massively multilingual NMT.

Machine Translation · Translation

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

3 code implementations · 21 Feb 2019 · Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence · Speech Recognition

Leveraging Weakly Supervised Data to Improve End-to-End Speech-to-Text Translation

no code implementations · 5 Nov 2018 · Ye Jia, Melvin Johnson, Wolfgang Macherey, Ron J. Weiss, Yuan Cao, Chung-Cheng Chiu, Naveen Ari, Stella Laurenzo, Yonghui Wu

In this paper, we demonstrate that using pre-trained MT or text-to-speech (TTS) synthesis models to convert weakly supervised data into speech-to-translation pairs for ST training can be more effective than multi-task learning.

Machine Translation · Multi-Task Learning · +3
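
The snippet above describes two conversion routes, which a short sketch makes explicit: translate ASR transcripts with a pre-trained MT model to obtain (speech, translation) pairs, and synthesize speech for the source side of MT bitext with a TTS model. The mt_translate and tts_synthesize functions below are hypothetical stand-ins for real pre-trained models.

```python
def mt_translate(text: str) -> str:
    """Hypothetical pre-trained MT model."""
    return f"<translation of: {text}>"

def tts_synthesize(text: str) -> bytes:
    """Hypothetical pre-trained TTS model."""
    return f"<audio for: {text}>".encode()

def augment_from_asr(asr_pairs):
    """(speech, transcript) -> (speech, translated transcript)."""
    return [(speech, mt_translate(transcript)) for speech, transcript in asr_pairs]

def augment_from_mt(bitext):
    """(source text, target text) -> (synthesized source speech, target text)."""
    return [(tts_synthesize(src), tgt) for src, tgt in bitext]

print(augment_from_asr([(b"<speech bytes>", "guten Morgen")]))
print(augment_from_mt([("guten Morgen", "good morning")]))
```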

Zero-Shot Cross-lingual Classification Using Multilingual Neural Machine Translation

no code implementations · 12 Sep 2018 · Akiko Eriguchi, Melvin Johnson, Orhan Firat, Hideto Kazawa, Wolfgang Macherey

However, little attention has been paid to leveraging representations learned by a multilingual NMT system to enable zero-shot multilinguality in other NLP tasks.

Cross-Lingual Transfer · General Classification · +4

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation

4 code implementations · TACL 2017 · Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, Jeffrey Dean

In addition to improving the translation quality of language pairs that the model was trained with, our models can also learn to perform implicit bridging between language pairs never seen explicitly during training, showing that transfer learning and zero-shot translation are possible for neural translation.

Machine Translation · Transfer Learning · +1
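
Per the full paper, the mechanism behind this system is a single shared model with an artificial token prepended to the source sentence to select the target language; no architecture change is needed, and prepending a token for a pair unseen in training requests zero-shot translation. The token spelling below follows the paper's examples but is illustrative.

```python
def add_target_token(src: str, tgt_lang: str) -> str:
    """Prepend the artificial target-language token to the source."""
    return f"<2{tgt_lang}> {src}"

# Supervised direction seen in training:
print(add_target_token("How are you?", "es"))   # <2es> How are you?

# Zero-shot: if training covered Spanish<->English and English<->Portuguese,
# requesting Spanish->Portuguese relies on implicit bridging.
print(add_target_token("Hola, ¿cómo estás?", "pt"))
```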
