Search Results for author: Naveen Arivazhagan

Found 17 papers, 4 papers with code

Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents

no code implementations • ACL 2022 • Biao Zhang, Ankur Bapna, Melvin Johnson, Ali Dabirmoghaddam, Naveen Arivazhagan, Orhan Firat

Using simple concatenation-based DocNMT, we explore the effect of 3 factors on the transfer: the number of teacher languages with document level data, the balance between document and sentence level data at training, and the data condition of parallel documents (genuine vs. backtranslated).

Machine Translation • Sentence • +2
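The concatenation-based DocNMT setup above trains an otherwise standard sentence-level model on blocks of consecutive sentences. A minimal sketch of that data preparation, assuming a sentence-aligned parallel document and a hypothetical `<sep>` boundary token (both illustrative choices, not details taken from the paper):

```python
from typing import List, Tuple

SEP = " <sep> "  # hypothetical separator token marking sentence boundaries

def make_doc_examples(src_sents: List[str],
                      tgt_sents: List[str],
                      window: int = 3) -> List[Tuple[str, str]]:
    """Build document-level training pairs by concatenating `window`
    consecutive source/target sentences of a sentence-aligned document."""
    assert len(src_sents) == len(tgt_sents)
    examples = []
    for i in range(0, len(src_sents), window):
        src_block = SEP.join(src_sents[i:i + window])
        tgt_block = SEP.join(tgt_sents[i:i + window])
        examples.append((src_block, tgt_block))
    return examples

# Toy usage: two-sentence windows over a three-sentence "document".
src = ["Er kam spät an.", "Der Zug hatte Verspätung.", "Niemand wartete."]
tgt = ["He arrived late.", "The train was delayed.", "Nobody was waiting."]
for s, t in make_doc_examples(src, tgt, window=2):
    print(s, "=>", t)
```

Varying `window` (including leaving some examples as single sentences) is one simple way to control the balance between document-level and sentence-level data in the training mix.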

Simultaneous Translation

no code implementations • EMNLP 2020 • Liang Huang, Colin Cherry, Mingbo Ma, Naveen Arivazhagan, Zhongjun He

Simultaneous translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, legal proceedings, and medicine.

Machine Translation • speech-recognition • +3

Sentence Boundary Augmentation For Neural Machine Translation Robustness

no code implementations • 21 Oct 2020 • Daniel Li, Te I, Naveen Arivazhagan, Colin Cherry, Dirk Padfield

Specifically, in the context of long-form speech translation systems, where the input transcripts come from Automatic Speech Recognition (ASR), the NMT models have to handle errors including phoneme substitutions, grammatical structure, and sentence boundaries, all of which pose challenges to NMT robustness.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +7
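Because ASR output rarely segments speech at the sentence boundaries the translation model saw in training, the augmentation idea in the title amounts to perturbing segment boundaries in the training data. A rough sketch of one such perturbation, where the merge probability and the merge-only strategy are illustrative assumptions rather than the paper's exact recipe:

```python
import random
from typing import List, Tuple

def augment_boundaries(sent_pairs: List[Tuple[str, str]],
                       merge_prob: float = 0.3,
                       seed: int = 0) -> List[Tuple[str, str]]:
    """Randomly merge adjacent (source, target) sentence pairs so training
    also covers segments whose boundaries do not coincide with true
    sentence boundaries, mimicking ASR-style segmentation noise."""
    rng = random.Random(seed)
    out: List[Tuple[str, str]] = []
    i = 0
    while i < len(sent_pairs):
        src, tgt = sent_pairs[i]
        if i + 1 < len(sent_pairs) and rng.random() < merge_prob:
            nxt_src, nxt_tgt = sent_pairs[i + 1]
            out.append((src + " " + nxt_src, tgt + " " + nxt_tgt))
            i += 2
        else:
            out.append((src, tgt))
            i += 1
    return out

# Toy usage on three aligned sentence pairs.
pairs = [("Hallo.", "Hello."), ("Wie geht es?", "How are you?"), ("Gut.", "Good.")]
print(augment_boundaries(pairs, merge_prob=0.5))
```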

Language-agnostic BERT Sentence Embedding

6 code implementations • ACL 2022 • Fangxiaoyu Feng, Yinfei Yang, Daniel Cer, Naveen Arivazhagan, Wei Wang

While BERT is an effective method for learning monolingual sentence embeddings for semantic similarity and embedding based transfer learning (Reimers and Gurevych, 2019), BERT based cross-lingual sentence embeddings have yet to be explored.

Language Modelling • Masked Language Modeling • +11
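LaBSE is released as a public checkpoint, so the cross-lingual behaviour described above can be exercised directly; the sketch below assumes the `sentence-transformers` package and the `sentence-transformers/LaBSE` hub name, which may differ from the checkpoint you actually use:

```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

# Assumed public checkpoint name; adjust if the hub entry differs.
model = SentenceTransformer("sentence-transformers/LaBSE")

sentences = [
    "Dogs are great companions.",           # English
    "Los perros son grandes compañeros.",   # Spanish paraphrase
    "The stock market fell sharply today.", # unrelated sentence
]
embeddings = model.encode(sentences, normalize_embeddings=True)

# Cosine similarity: cross-lingual paraphrase pair vs. an unrelated pair.
print(util.cos_sim(embeddings[0], embeddings[1]).item())  # expected: high
print(util.cos_sim(embeddings[0], embeddings[2]).item())  # expected: low
```

Since the embeddings are L2-normalized, cosine similarity reduces to a dot product, which is what makes large-scale bitext mining with such embeddings practical.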

Re-translation versus Streaming for Simultaneous Translation

no code implementations • WS 2020 • Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, George Foster

There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available.

Attribute • Data Augmentation • +2
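In the streaming paradigm described in the abstract, the system only ever appends to a growing hypothesis. The toy policy below illustrates that append-only behaviour with a simple wait-k lag; the `translate_prefix` callable is a hypothetical stand-in for an incremental decoder, and wait-k is used here only because it is the simplest streaming policy, not because it is the paper's method:

```python
from typing import Callable, List

def stream_translate(source_tokens: List[str],
                     translate_prefix: Callable[[List[str], int], List[str]],
                     k: int = 3) -> List[str]:
    """Append-only streaming policy: after an initial lag of k source
    tokens, extend the hypothesis by one target token per new source
    token; committed output is never revised."""
    emitted: List[str] = []
    for i in range(1, len(source_tokens) + 1):
        if i < k:
            continue  # still waiting for enough source context
        target_len = i - k + 1
        hyp = translate_prefix(source_tokens[:i], target_len)  # hypothetical decoder call
        emitted.extend(hyp[len(emitted):])  # append only, never rewrite
    return emitted

# Toy usage with a dummy "decoder" that just copies source tokens.
print(stream_translate("je suis très content aujourd'hui".split(),
                       lambda src, n: src[:n], k=2))
```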

Controlling Computation versus Quality for Neural Sequence Models

no code implementations • 17 Feb 2020 • Ankur Bapna, Naveen Arivazhagan, Orhan Firat

Further, methods that adapt the amount of computation to the example focus on finding a fixed inference-time computational graph per example, ignoring any external computational budgets or varying inference time limitations.

Representation Learning

Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation

1 code implementation • 6 Dec 2019 • Naveen Arivazhagan, Colin Cherry, Te I, Wolfgang Macherey, Pallavi Baljekar, George Foster

As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows.

Machine Translation • speech-recognition • +2
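The re-translation approach above instead re-decodes the entire available source prefix whenever new content arrives, so earlier output can be revised before it is finalized. A minimal sketch, with `translate` standing in for any full-sentence MT system:

```python
from typing import Callable, Iterable, List

def retranslate_stream(source_chunks: Iterable[str],
                       translate: Callable[[str], str]) -> List[str]:
    """Re-translation: whenever new source arrives, translate the whole
    prefix from scratch and replace the currently displayed hypothesis."""
    prefix = ""
    displayed: List[str] = []
    for chunk in source_chunks:
        prefix = (prefix + " " + chunk).strip()
        hypothesis = translate(prefix)   # hypothetical full-sentence MT call
        displayed.append(hypothesis)     # each entry may revise the previous one
    return displayed

# Toy usage with an identity "translator", just to show the revision pattern.
print(retranslate_stream(["the cat", "sat on", "the mat"], lambda s: s.upper()))
```

The trade-off for this freedom to revise is flicker: successive hypotheses can differ, so the displayed translation may change after the user has already read part of it.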

Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation

no code implementations • 1 Sep 2019 • Aditya Siddhant, Melvin Johnson, Henry Tsai, Naveen Arivazhagan, Jason Riesa, Ankur Bapna, Orhan Firat, Karthik Raman

The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating over 100 languages to and from English within a single model.

Cross-Lingual Transfer • Machine Translation • +3

Small and Practical BERT Models for Sequence Labeling

no code implementations • IJCNLP 2019 • Henry Tsai, Jason Riesa, Melvin Johnson, Naveen Arivazhagan, Xin Li, Amelia Archer

We propose a practical scheme to train a single multilingual sequence labeling model that yields state of the art results and is small and fast enough to run on a single CPU.

Part-Of-Speech Tagging

Monotonic Infinite Lookback Attention for Simultaneous Machine Translation

no code implementations • ACL 2019 • Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, Colin Raffel

Simultaneous machine translation begins to translate each source sentence before the source speaker is finished speaking, with applications to live and streaming scenarios.

Machine Translation • NMT • +2

The Missing Ingredient in Zero-Shot Neural Machine Translation

no code implementations • 17 Mar 2019 • Naveen Arivazhagan, Ankur Bapna, Orhan Firat, Roee Aharoni, Melvin Johnson, Wolfgang Macherey

Multilingual Neural Machine Translation (NMT) models are capable of translating between multiple source and target languages.

Machine Translation • NMT • +1
