1 code implementation • ICML 2020 • Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
State-of-the-art neural machine translation models generate a translation from left to right, with every step conditioned on the previously generated tokens.
Ranked #54 on Machine Translation on WMT2014 English-German
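The left-to-right generation described here is the standard autoregressive factorization, log p(y|x) = Σ_t log p(y_t | y_<t, x). A minimal sketch of scoring a translation under it (`next_token_probs` is a hypothetical stand-in for any such model, not the paper's code):

```python
import math

def sequence_log_prob(next_token_probs, source, target):
    """Score a translation under the left-to-right factorization
    log p(y|x) = sum_t log p(y_t | y_<t, x): every step conditions
    on the source and on all previously generated tokens."""
    log_p = 0.0
    for t in range(len(target)):
        step = next_token_probs(source, target[:t])  # p(. | y_<t, x)
        log_p += math.log(step[target[t]])
    return log_p

# Toy uniform "model" over a three-word vocabulary, for illustration only.
toy_model = lambda src, prefix: {"hello": 1/3, "world": 1/3, "</s>": 1/3}
print(sequence_log_prob(toy_model, ["hallo", "welt"], ["hello", "world", "</s>"]))
```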
1 code implementation • WMT (EMNLP) 2021 • Chau Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan
We describe Facebook’s multilingual model submission to the WMT2021 shared task on news translation.
no code implementations • 7 Feb 2023 • Simeng Sun, Maha Elbayad, Anna Sun, James Cross
With multilingual machine translation (MMT) models continuing to grow in size and number of supported languages, it is natural to reuse and upgrade existing models to save computation as data becomes available in more languages.
6 code implementations • Meta AI 2022 • NLLB team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Jeff Wang
Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today.
Ranked #1 on Machine Translation on IWSLT2017 French-English (SacreBLEU metric)
no code implementations • EACL 2021 • Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li
Recent work in multilingual translation has advanced translation quality beyond bilingual baselines by using deep Transformer models with increased capacity.
3 code implementations • 22 May 2022 • Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale
We find that hyper-adapters are more parameter-efficient than regular adapters, reaching the same performance with up to 12 times fewer parameters.
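A minimal numpy sketch of the hyper-adapter idea as described here: a shared hypernetwork generates per-language adapter weights from a language embedding, so a new language costs one embedding rather than a full adapter (all names and dimensions below are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck, d_lang = 512, 64, 32

# Hypernetwork parameters, shared across all languages: they map a
# language embedding to the flattened weights of a bottleneck adapter.
W_down_gen = rng.normal(0.0, 0.02, (d_lang, d_model * d_bottleneck))
W_up_gen = rng.normal(0.0, 0.02, (d_lang, d_bottleneck * d_model))

def make_adapter(lang_emb):
    """Generate the weights of a residual bottleneck adapter
    from a per-language embedding."""
    W_down = (lang_emb @ W_down_gen).reshape(d_model, d_bottleneck)
    W_up = (lang_emb @ W_up_gen).reshape(d_bottleneck, d_model)
    return lambda h: h + np.maximum(h @ W_down, 0.0) @ W_up  # ReLU bottleneck

lang_emb = rng.normal(size=d_lang)        # e.g. a learned embedding for "de"
hidden = rng.normal(size=(10, d_model))   # a batch of transformer states
print(make_adapter(lang_emb)(hidden).shape)  # (10, 512)
```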
no code implementations • NAACL 2022 • Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe
Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
no code implementations • AMTA 2022 • Shiyue Zhang, Vishrav Chaudhary, Naman Goyal, James Cross, Guillaume Wenzek, Mohit Bansal, Francisco Guzmán
Since a skewed data distribution is considered to be harmful, a sampling strategy is usually used to balance languages in the corpus.
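One standard balancing strategy of this kind is temperature-based sampling, where language i is drawn with probability proportional to (n_i / N)^(1/T); a sketch (the paper analyzes such strategies, and its exact scheme may differ):

```python
def sampling_weights(corpus_sizes, temperature=5.0):
    """Temperature-based language sampling: p_i is proportional to
    (n_i / N)^(1/T). T = 1 follows the raw data distribution; larger T
    flattens it toward uniform, upweighting low-resource languages."""
    total = sum(corpus_sizes.values())
    unnorm = {k: (n / total) ** (1.0 / temperature) for k, n in corpus_sizes.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

# A high-resource pair no longer dwarfs a low-resource one.
print(sampling_weights({"en-fr": 40_000_000, "en-ne": 500_000}))
```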
no code implementations • 25 Mar 2022 • Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty
In this work, we introduce a two-stage curriculum training framework for NMT where we fine-tune a base NMT model on subsets of data, selected by both deterministic scoring using pre-trained methods and online scoring that considers prediction scores of the emerging NMT model.
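A self-contained sketch of that two-stage loop, with a dummy model standing in for the NMT system (all interfaces here are hypothetical, not the paper's code):

```python
class DummyNMT:
    """Hypothetical stand-in for an NMT model."""
    def finetune(self, subset):
        print(f"fine-tuning on {len(subset)} examples")
    def prediction_score(self, example):
        return -len(example)  # toy proxy for the model's confidence

def curriculum_finetune(model, data, offline_score, keep=0.5, rounds=2):
    # Stage 1: deterministic scoring with a fixed, pre-trained method.
    k = int(len(data) * keep)
    model.finetune(sorted(data, key=offline_score, reverse=True)[:k])
    # Stage 2: online scoring by the emerging NMT model itself, so the
    # selected subset adapts as the model improves.
    for _ in range(rounds):
        model.finetune(sorted(data, key=model.prediction_score, reverse=True)[:k])

curriculum_finetune(DummyNMT(), ["clean pair", "longer noisy pair !!", "ok pair"],
                    offline_score=len)
```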
no code implementations • NAACL 2022 • Dheeru Dua, Shruti Bhosale, Vedanuj Goswami, James Cross, Mike Lewis, Angela Fan
Multi-task learning with an unbalanced data distribution skews model learning towards high resource tasks, especially when model capacity is fixed and fully shared across all tasks.
no code implementations • ACL 2022 • Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzmán
Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.
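A sketch of the self-ensemble step over alternative inputs (e.g. different subword segmentations of the same source); `model_probs` is a hypothetical stand-in mapping an input to a vocabulary-sized probability vector:

```python
import numpy as np

def self_ensemble(model_probs, variants):
    """Average one model's next-token distributions over
    alternative renderings of the same source sentence."""
    return np.mean([model_probs(v) for v in variants], axis=0)

# Toy usage with a fake two-token vocabulary.
fake = {"foo bar": np.array([0.9, 0.1]), "fo o bar": np.array([0.6, 0.4])}
print(self_ensemble(fake.get, ["foo bar", "fo o bar"]))  # [0.75 0.25]
```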
no code implementations • EMNLP 2021 • Shuo Sun, Ahmed El-Kishky, Vishrav Chaudhary, James Cross, Francisco Guzmán, Lucia Specia
Sentence-level quality estimation (QE) of machine translation is traditionally formulated as a regression task, and the performance of QE models is typically measured by Pearson correlation with human labels.
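Pearson correlation, the evaluation mentioned here, is standard arithmetic and easy to state exactly:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between predicted QE scores and human labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson([0.2, 0.5, 0.9], [0.1, 0.6, 0.8]))  # predicted vs. human scores
```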
1 code implementation • 22 Jun 2021 • Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina
As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies.
no code implementations • EMNLP 2021 • Ahmed El-Kishky, Adithya Renduchintala, James Cross, Francisco Guzmán, Philipp Koehn
Cross-lingual named-entity lexica are an important resource for multilingual NLP tasks such as machine translation and cross-lingual wikification.
1 code implementation • ACL 2021 • Danni Liu, Jan Niehues, James Cross, Francisco Guzmán, Xian Li
The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training.
no code implementations • 23 Aug 2020 • Qing Sun, James Cross
In this paper, we provide an in-depth analysis of KL-divergence minimization in the Forward and Backward orders, which shows that in the Backward order the learner is reinforced via on-policy learning.
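The distinction driving that analysis is which distribution the expectation is taken under; a minimal numpy illustration (the distributions are made up):

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

teacher = np.array([0.7, 0.2, 0.1])
student = np.array([0.5, 0.3, 0.2])

# Forward KL(teacher || student): expectation under the fixed teacher,
# i.e. the student learns from samples the teacher produces.
print(kl(teacher, student))
# Backward KL(student || teacher): expectation under the student's own
# distribution, which is what makes it an on-policy objective.
print(kl(student, teacher))
```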
2 code implementations • ICLR 2021 • Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith
We show that the speed disadvantage for autoregressive baselines compared to non-autoregressive methods has been overestimated in three aspects: suboptimal layer allocation, insufficient speed measurement, and lack of knowledge distillation.
3 code implementations • ICLR 2020 • Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu
Simultaneous machine translation models start generating a target sequence before they have finished reading the source sequence.
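The paper's policy is learned monotonic multihead attention; as a simpler illustration of simultaneous decoding, here is the fixed wait-k schedule (read k source tokens, then alternate between writing one target token and reading one more). `translate_step` is a hypothetical model call, not the paper's API:

```python
def wait_k_decode(translate_step, source_tokens, k=3, max_len=50):
    """Wait-k simultaneous decoding: generation starts after only
    k source tokens have been read."""
    target, read = [], min(k, len(source_tokens))
    while len(target) < max_len:
        token = translate_step(source_tokens[:read], target)
        if token == "</s>":
            break
        target.append(token)                      # WRITE one target token
        read = min(read + 1, len(source_tokens))  # READ one more source token
    return target

# Toy 'model' that copies the source word-by-word, for illustration only.
toy = lambda src, tgt: src[len(tgt)] if len(tgt) < len(src) else "</s>"
print(wait_k_decode(toy, "guten morgen welt".split(), k=1))
```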
no code implementations • 25 Sep 2019 • Qing Sun, James Cross, Dmitriy Genzel
Sequence-to-sequence models such as transformers, which are now being used in a wide variety of NLP tasks, typically need to have very high capacity in order to perform well.
1 code implementation • WS 2018 • Felix Stahlberg, James Cross, Veselin Stoyanov
Neural Machine Translation (NMT) typically leverages monolingual data in training through backtranslation.
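A sketch of the backtranslation loop that line refers to: a reverse (target-to-source) model turns monolingual target text into synthetic source sentences, and the synthetic pairs are mixed with real bitext to train the forward model (`reverse_model` is a hypothetical stand-in):

```python
def backtranslate(reverse_model, real_bitext, target_monolingual):
    """Build synthetic (source, target) pairs from monolingual
    target-side text and mix them with the real parallel data."""
    synthetic = [(reverse_model(y), y) for y in target_monolingual]
    return real_bitext + synthetic  # train the forward model on the union

# Toy 'reverse model' that just flips word order, for illustration only.
flip = lambda y: " ".join(reversed(y.split()))
print(backtranslate(flip, [("hallo welt", "hello world")], ["good morning"]))
```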
1 code implementation • EMNLP 2016 • James Cross, Liang Huang
Parsing accuracy using efficient greedy transition systems has improved dramatically in recent years thanks to neural networks.
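For a concrete picture of a greedy transition system, here is the classic arc-standard dependency variant (the paper itself uses a span-based constituency system with bi-LSTM features, so this illustrates the family, not its specific method; `oracle` stands in for the neural classifier):

```python
def arc_standard_parse(words, oracle):
    """Greedy arc-standard parsing: at each step a classifier picks
    SHIFT, LEFT-ARC, or RIGHT-ARC; arcs are (head, dependent) pairs."""
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        action = oracle(stack, buffer)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":      # top of stack heads the item below it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        else:                           # RIGHT-ARC: item below heads the top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

# Toy oracle: shift everything, then attach right-to-left.
toy = lambda stack, buffer: "SHIFT" if buffer else "RIGHT-ARC"
print(arc_standard_parse("I saw her".split(), toy))  # [(1, 2), (0, 1)]
```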
no code implementations • ACL 2016 • James Cross, Liang Huang
Recently, neural network approaches for parsing have largely automated the combination of individual features, but they still rely on (often a large number of) atomic features created from human linguistic intuition, potentially omitting important global context.
no code implementations • 19 Nov 2015 • James Cross, Bing Xiang, Bo-Wen Zhou
We propose two methods of learning vector representations of words and phrases that each combine sentence context with structural features extracted from dependency trees.