Search Results for author: Huda Khayrallah

Found 25 papers, 9 with code

On the Evaluation of Machine Translation n-best Lists

no code implementations • EMNLP (Eval4NLP) 2020 • Jacob Bremerman, Huda Khayrallah, Douglas Oard, Matt Post

The first and principal contribution is an evaluation measure that characterizes the translation quality of an entire n-best list by asking whether many of the valid translations are placed near the top of the list.

Tasks: Machine Translation, Translation +1
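
The snippet does not spell out the measure itself, so here is a minimal, hypothetical sketch of a rank-sensitive score in that spirit: valid translations found near the top of the n-best list count for more than valid translations buried lower down. The `valid` set of acceptable translations is an assumed input, not something the paper provides in this form.

```python
def rank_weighted_validity(nbest, valid):
    """Score an n-best list by how near its valid entries sit to the top.

    nbest: list of hypothesis strings, best-first.
    valid: set of acceptable translations (hypothetical input).
    Returns a value in [0, 1]; 1.0 means the valid translations that
    appear are packed at the very top of the list.
    """
    if not nbest:
        return 0.0
    # Weight position i by 1/(i+1): hits near the top count for more.
    gained = sum(1.0 / (i + 1) for i, hyp in enumerate(nbest) if hyp in valid)
    ideal = sum(1.0 / (i + 1) for i in range(min(len(valid), len(nbest))))
    return gained / ideal if ideal else 0.0

print(rank_weighted_validity(
    ["the cat sat", "a cat sat", "cat the sat"],
    {"the cat sat", "a cat sat"}))  # -> 1.0 (both valid entries on top)
```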

On-the-Fly Fusion of Large Language Models and Machine Translation

no code implementations • 14 Nov 2023 • Hieu Hoang, Huda Khayrallah, Marcin Junczys-Dowmunt

We propose the on-the-fly ensembling of a machine translation model with an LLM, prompted on the same task and input.

Tasks: In-Context Learning, Machine Translation +2
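
A minimal sketch of what token-level ensembling of this kind can look like, assuming the two models' next-token distributions have already been mapped onto a shared vocabulary (the paper's exact weighting scheme is not given in this snippet):

```python
import numpy as np

def fused_next_token(p_mt, p_llm, lam=0.5):
    """Interpolate two models' next-token distributions at one decode step.

    p_mt, p_llm: probability vectors over a shared vocabulary (assumed
    aligned here for illustration; real systems must map vocabularies).
    lam: weight on the MT model.
    """
    p = lam * np.asarray(p_mt) + (1.0 - lam) * np.asarray(p_llm)
    return int(np.argmax(p))  # greedy choice; beam search interpolates the same way

# Toy 4-token vocabulary: the models disagree, the ensemble arbitrates.
print(fused_next_token([0.1, 0.6, 0.2, 0.1], [0.5, 0.2, 0.2, 0.1]))  # -> 1
```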

SOTASTREAM: A Streaming Approach to Machine Translation Training

1 code implementation • 14 Aug 2023 • Matt Post, Thamme Gowda, Roman Grundkiewicz, Huda Khayrallah, Rohit Jain, Marcin Junczys-Dowmunt

Many machine translation toolkits make use of a data preparation step wherein raw data is transformed into a tensor format that can be used directly by the trainer.

Tasks: Machine Translation, Management +2
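
As a rough illustration of the streaming alternative (not SOTASTREAM's actual API), a trainer can pull examples from an endless generator that reads and shuffles raw text on the fly, so no tensorized copy of the corpus is ever materialized; the filename is hypothetical:

```python
import itertools
import random

def stream_pairs(path, buffer_size=10000, seed=0):
    """Endlessly stream (source, target) pairs from a raw TSV file,
    shuffling within a bounded buffer instead of pre-tensorizing."""
    rng = random.Random(seed)
    for _ in itertools.count():  # loop over the corpus forever
        buf = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                src, tgt = line.rstrip("\n").split("\t")
                buf.append((src, tgt))
                if len(buf) >= buffer_size:
                    rng.shuffle(buf)
                    yield from buf
                    buf = []
        rng.shuffle(buf)
        yield from buf  # flush the tail, then start the next pass

# The trainer just pulls batches off the iterator:
# batch = list(itertools.islice(stream_pairs("train.tsv"), 32))
```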

Measuring the 'I don't know' Problem through the Lens of Gricean Quantity

no code implementations • NAACL 2021 • Huda Khayrallah, João Sedoc

We consider the intrinsic evaluation of neural generative dialog models through the lens of Grice's Maxims of Conversation (1975).
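
A crude proxy for this kind of measurement, and not the paper's actual metric, is simply the rate at which a model falls back on 'I don't know'-style replies:

```python
IDK_MARKERS = ("i don't know", "i do not know", "i'm not sure")

def idk_rate(responses):
    """Fraction of generated responses that hedge with an
    'I don't know'-style reply (illustrative proxy only)."""
    hits = sum(any(m in r.lower() for m in IDK_MARKERS) for r in responses)
    return hits / len(responses) if responses else 0.0

print(idk_rate(["I don't know.", "It opens at nine.", "I'm not sure, sorry."]))
```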

The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education

no code implementations • WS 2020 • Huda Khayrallah, Jacob Bremerman, Arya D. McCarthy, Kenton Murray, Winston Wu, Matt Post

This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE).

Tasks: Machine Translation, Translation

Simulated Multiple Reference Training Improves Low-Resource Machine Translation

1 code implementation • EMNLP 2020 • Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings.

Tasks: Machine Translation, Sentence +2
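
A deliberately naive sketch of the idea: average the translation loss over several paraphrases of the single reference, simulating a multi-reference signal. The `model.loss` and `paraphrase` hooks are hypothetical stand-ins for an NMT trainer and a paraphraser model.

```python
def multi_reference_loss(model, src, ref, paraphrase, n_samples=4):
    """Average the loss over the reference plus sampled paraphrases of it,
    so training sees several valid targets instead of one.

    model.loss(src, tgt) and paraphrase(ref) are hypothetical hooks."""
    refs = [ref] + [paraphrase(ref) for _ in range(n_samples - 1)]
    return sum(model.loss(src, r) for r in refs) / len(refs)
```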

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

1 code implementation • NAACL 2019 • J. Edward Hu, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, Benjamin Van Durme

Lexically-constrained sequence decoding allows for explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting.

Tasks: Data Augmentation, Machine Translation +3
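
A toy illustration of the constraint logic, assuming whitespace-tokenized strings: hypotheses containing a negative (banned) phrase are pruned from the beam, and a hypothesis may only finish once every positive (required) phrase appears. Production implementations track constraints over subword prefixes inside beam search rather than by string matching.

```python
def violates(tokens, negative):
    """True if any banned phrase occurs in the output so far."""
    text = " ".join(tokens)
    return any(phrase in text for phrase in negative)

def satisfied(tokens, positive):
    """True once every required phrase appears in the output."""
    text = " ".join(tokens)
    return all(phrase in text for phrase in positive)

beam = [["she", "signed", "the", "contract"],
        ["she", "signed", "the", "deal"]]
positive, negative = ["the contract"], ["the deal"]
beam = [h for h in beam if not violates(h, negative)]   # prune banned phrases
print([h for h in beam if satisfied(h, positive)])      # only these may finish
```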

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

no code implementations • WS 2018 • Philipp Koehn, Huda Khayrallah, Kenneth Heafield, Mikel L. Forcada

We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems.

Tasks: Machine Translation, Outlier Detection +2
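
For illustration only, a toy scorer in this vein, using weak surface heuristics where actual submissions used much stronger signals such as language identification and translation-model scores; `top_fraction` then keeps the best-scoring 1% (or 10%) of pairs:

```python
def pair_score(src, tgt):
    """Toy quality score for a crawled sentence pair: penalize empty or
    copied targets and extreme length ratios (illustrative heuristics)."""
    if not src or not tgt or src == tgt:
        return 0.0
    ratio = len(src.split()) / max(len(tgt.split()), 1)
    return 1.0 / (1.0 + abs(ratio - 1.0))

def top_fraction(pairs, frac=0.01):
    """Keep the highest-scoring fraction of the corpus,
    e.g. frac=0.01 for the 1% condition of the task."""
    ranked = sorted(pairs, key=lambda p: pair_score(*p), reverse=True)
    return ranked[: max(1, int(len(ranked) * frac))]
```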

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation

1 code implementation • WS 2018 • Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.

Tasks: Domain Adaptation, Machine Translation +1
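
The freezing itself is a one-liner in a framework like PyTorch; which component to freeze (encoder, decoder, or an embedding space) is the experimental variable. The model attribute names below are illustrative, not from the paper's code:

```python
import torch.nn as nn

def freeze(module: nn.Module):
    """Exclude a component from continued-training updates."""
    for p in module.parameters():
        p.requires_grad = False

# E.g., adapt only the decoder to the new domain (hypothetical names):
# freeze(model.encoder)
# freeze(model.src_embed)
# The optimizer should then be built over trainable parameters only:
# opt = torch.optim.Adam(p for p in model.parameters() if p.requires_grad)
```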

Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation

1 code implementation • WS 2018 • Huda Khayrallah, Brian Thompson, Kevin Duh, Philipp Koehn

Supervised domain adaptation, where a large generic corpus and a smaller in-domain corpus are both available for training, is a challenge for neural machine translation (NMT).

Tasks: Domain Adaptation, Machine Translation +2
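
A common form of such a regularizer, sketched here in PyTorch without claiming it matches the paper's exact objective, mixes the in-domain cross-entropy with a KL term that pulls the adapted model's predictions back toward the frozen general-domain model:

```python
import torch.nn.functional as F

def regularized_loss(logits_new, logits_orig, target, alpha=0.1):
    """Cross-entropy on in-domain data plus a pull toward the frozen
    general-domain model, to limit catastrophic forgetting.

    logits_*: (batch, vocab) next-token logits; logits_orig comes from
    the original model run without gradient. alpha weights the pull.
    """
    ce = F.cross_entropy(logits_new, target)
    kl = F.kl_div(F.log_softmax(logits_new, dim=-1),
                  F.softmax(logits_orig, dim=-1),
                  reduction="batchmean")
    return (1 - alpha) * ce + alpha * kl
```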

On the Impact of Various Types of Noise on Neural Machine Translation

1 code implementation • WS 2018 • Huda Khayrallah, Philipp Koehn

We examine how various types of noise in the parallel training data impact the quality of neural machine translation systems.

Tasks: Machine Translation, Sentence +1
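
A hypothetical harness for this kind of study corrupts a controlled fraction of the corpus with one noise type at a time before retraining; the categories below (misaligned pairs, copied/untranslated source, scrambled word order) are illustrative of web-crawl noise, not the paper's exact taxonomy:

```python
import random

def inject_noise(pairs, kind, frac=0.1, seed=0):
    """Corrupt a fraction of a parallel corpus with one noise type,
    so its downstream effect on NMT quality can be measured."""
    rng = random.Random(seed)
    out = []
    for src, tgt in pairs:
        if rng.random() < frac:
            if kind == "misaligned":      # target from some other sentence
                tgt = rng.choice(pairs)[1]
            elif kind == "untranslated":  # raw source copied as the target
                tgt = src
            elif kind == "misordered":    # target word order scrambled
                words = tgt.split()
                rng.shuffle(words)
                tgt = " ".join(words)
        out.append((src, tgt))
    return out
```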

Paradigm Completion for Derivational Morphology

no code implementations • EMNLP 2017 • Ryan Cotterell, Ekaterina Vylomova, Huda Khayrallah, Christo Kirov, David Yarowsky

The generation of complex derived word forms has been an overlooked problem in NLP; we fill this gap by applying neural sequence-to-sequence models to the task.
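
For example, completing a derivational paradigm cell can be framed as character-level transduction from a base form plus a transformation tag to the derived form; the tag inventory and formatting below are illustrative, not the paper's exact scheme:

```python
def to_seq2seq_example(base, tag, derived):
    """Format one derivational cell as a character-level seq2seq pair
    (illustrative encoding: space-separated characters plus a tag token)."""
    src = " ".join(base) + " <" + tag + ">"
    return src, " ".join(derived)

# e.g. deriving an agent noun:
print(to_seq2seq_example("bake", "AGENT", "baker"))
# ('b a k e <AGENT>', 'b a k e r')
```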

Deep Generalized Canonical Correlation Analysis

3 code implementations • WS 2019 • Adrian Benton, Huda Khayrallah, Biman Gujral, Dee Ann Reisinger, Sheng Zhang, Raman Arora

We present Deep Generalized Canonical Correlation Analysis (DGCCA), a method for learning nonlinear transformations of arbitrarily many views of data, such that the resulting transformations are maximally informative of each other.

Tasks: Representation Learning, Stochastic Optimization
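
The snippet states the objective informally; in GCCA-style notation, DGCCA learns view-specific networks f_j and projections U_j so that every view agrees with a shared representation G. Conventions vary across write-ups, so treat this as a sketch, with X_j the j-th view and I_r the r-dimensional identity:

```latex
\min_{\{\theta_j\},\, \{U_j\},\, G}\; \sum_{j=1}^{J} \left\| G - f_j(X_j;\theta_j)\, U_j \right\|_F^2
\qquad \text{s.t.}\quad G^\top G = I_r
```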
