Search Results for author: Matt Post

Found 68 papers, 21 papers with code

Automatic Construction of Morphologically Motivated Translation Models for Highly Inflected, Low-Resource Languages

1 code implementation AMTA 2016 John Hewitt, Matt Post, David Yarowsky

Statistical Machine Translation (SMT) of highly inflected, low-resource languages suffers from the problem of low bitext availability, which is exacerbated by large inflectional paradigms.

Tasks: Machine Translation, Translation

ParBLEU: Augmenting Metrics with Automatic Paraphrases for the WMT’20 Metrics Shared Task

no code implementations WMT (EMNLP) 2020 Rachel Bawden, Biao Zhang, Andre Tättar, Matt Post

We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatically generated paraphrases produced by PRISM (Thompson and Post, 2020a), a multilingual neural machine translation system.

Tasks: Machine Translation, Translation

On the Evaluation of Machine Translation n-best Lists

no code implementations EMNLP (Eval4NLP) 2020 Jacob Bremerman, Huda Khayrallah, Douglas Oard, Matt Post

The first and principal contribution is an evaluation measure that characterizes the translation quality of an entire n-best list by asking whether many of the valid translations are placed near the top of the list.

Tasks: Machine Translation, Translation
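
The "valid translations near the top of the list" idea can be illustrated with a toy measure (a simplified sketch for illustration only, not the paper's actual evaluation measure; `topk_valid_fraction` and the sample data are invented for this example):

```python
def topk_valid_fraction(nbest, valid_refs, k=5):
    """Toy n-best quality measure: the fraction of the top-k entries
    of an n-best list that match some known-valid translation."""
    top = nbest[:k]
    return sum(1 for hyp in top if hyp in valid_refs) / len(top)

nbest = ["the cat sat", "a cat sat", "the cat sits", "cat the sat"]
valid = {"the cat sat", "the cat sits"}
print(topk_valid_fraction(nbest, valid, k=3))  # 2 of the top 3 are valid
```

A list that ranks valid translations higher scores better under such a measure, even when the 1-best entry is identical.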

SOTASTREAM: A Streaming Approach to Machine Translation Training

1 code implementation 14 Aug 2023 Matt Post, Thamme Gowda, Roman Grundkiewicz, Huda Khayrallah, Rohit Jain, Marcin Junczys-Dowmunt

Many machine translation toolkits make use of a data preparation step wherein raw data is transformed into a tensor format that can be used directly by the trainer.

Tasks: Machine Translation, Management (+2 more)
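
The streaming alternative to a pre-tensorized data-preparation step can be sketched in a few lines (a toy illustration of the general idea only, not SOTASTREAM's actual API; `stream` and its parameters are invented for this example):

```python
import itertools
import random

def stream(corpus, shuffle_buffer=4, seed=0):
    """Toy on-the-fly data streaming: cycle over raw lines forever,
    shuffling within a small buffer, so a trainer can consume
    examples without a pre-built, tensorized dataset on disk."""
    rng = random.Random(seed)
    def shuffled():
        it = itertools.cycle(corpus)
        buf = list(itertools.islice(it, shuffle_buffer))
        while True:
            i = rng.randrange(len(buf))
            yield buf[i]          # emit a random buffered example
            buf[i] = next(it)     # refill the slot from the stream
    return shuffled()

gen = stream(["a", "b", "c"])
print(list(itertools.islice(gen, 5)))  # five examples, drawn endlessly
```

Because the generator never materializes the full dataset, raw data can be transformed, filtered, or augmented lazily as the trainer requests batches.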

Do GPTs Produce Less Literal Translations?

1 code implementation 26 May 2023 Vikas Raunak, Arul Menezes, Matt Post, Hany Hassan Awadalla

On the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs.

Tasks: Machine Translation, NMT (+3 more)

Escaping the sentence-level paradigm in machine translation

no code implementations 25 Apr 2023 Matt Post, Marcin Junczys-Dowmunt

It is well-known that document context is vital for resolving a range of translation ambiguities, and in fact the document setting is the most natural setting for nearly all translation.

Tasks: Machine Translation, Translation

Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models

no code implementations 19 Nov 2022 Vikas Raunak, Matt Post, Arul Menezes

More concretely, while the associated utility and methods of interacting with generative models have expanded, a similar expansion has not been observed in their evaluation practices.

Tasks: Domain Generalization

Additive Interventions Yield Robust Multi-Domain Machine Translation Models

no code implementations 23 Oct 2022 Elijah Rippeth, Matt Post

Additive interventions are a recently-proposed mechanism for controlling target-side attributes in neural machine translation.

Tasks: Machine Translation, TAG (+1 more)

SALTED: A Framework for SAlient Long-Tail Translation Error Detection

no code implementations 20 May 2022 Vikas Raunak, Matt Post, Arul Menezes

Traditional machine translation (MT) metrics provide an average measure of translation quality that is insensitive to the long tail of behavioral problems in MT.

Tasks: Machine Translation, NMT (+1 more)

Levenshtein Training for Word-level Quality Estimation

1 code implementation EMNLP 2021 Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Philipp Koehn

We propose a novel scheme to use the Levenshtein Transformer to perform the task of word-level quality estimation.

Tasks: Transfer Learning, Translation

Robust Open-Vocabulary Translation from Visual Text Representations

1 code implementation EMNLP 2021 Elizabeth Salesky, David Etter, Matt Post

Machine translation models have discrete vocabularies and commonly use subword segmentation techniques to achieve an 'open vocabulary.'

Tasks: Machine Translation, Translation

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations 2 Feb 2021 Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

Tasks: Speech Recognition (+1 more)

Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity

1 code implementation WMT (EMNLP) 2020 Brian Thompson, Matt Post

Recent work has shown that a multilingual neural machine translation (NMT) model can be used to judge how well a sentence paraphrases another sentence in the same language (Thompson and Post, 2020); however, attempting to generate paraphrases from such a model using standard beam search produces trivial copies or near copies.

Tasks: Machine Translation, NMT (+4 more)

The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education

no code implementations WS 2020 Huda Khayrallah, Jacob Bremerman, Arya D. McCarthy, Kenton Murray, Winston Wu, Matt Post

This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE).

Tasks: Machine Translation, Translation

Benchmarking Neural and Statistical Machine Translation on Low-Resource African Languages

no code implementations LREC 2020 Kevin Duh, Paul McNamee, Matt Post, Brian Thompson

In this study, we benchmark state-of-the-art statistical and neural machine translation systems on two African languages that do not have large amounts of resources: Somali and Swahili.

Tasks: Benchmarking, Machine Translation (+2 more)

Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing

1 code implementation EMNLP 2020 Brian Thompson, Matt Post

We frame the task of machine translation evaluation as one of scoring machine translation output with a sequence-to-sequence paraphraser, conditioned on a human reference.

Tasks: Machine Translation, NMT (+1 more)

Simulated Multiple Reference Training Improves Low-Resource Machine Translation

1 code implementation EMNLP 2020 Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings.

Tasks: Machine Translation, Translation

A Discriminative Neural Model for Cross-Lingual Word Alignment

no code implementations IJCNLP 2019 Elias Stengel-Eskin, Tzu-Ray Su, Matt Post, Benjamin Van Durme

We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model.

Tasks: Machine Translation, NER (+2 more)

JHU 2019 Robustness Task System Description

no code implementations WS 2019 Matt Post, Kevin Duh

We describe the JHU submissions to the French–English, Japanese–English, and English–Japanese Robustness Task at WMT 2019.

Improved Lexically Constrained Decoding for Translation and Monolingual Rewriting

1 code implementation NAACL 2019 J. Edward Hu, Huda Khayrallah, Ryan Culkin, Patrick Xia, Tongfei Chen, Matt Post, Benjamin Van Durme

Lexically-constrained sequence decoding allows for explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting.

Tasks: Data Augmentation, Machine Translation (+3 more)

A Call for Clarity in Reporting BLEU Scores

2 code implementations WS 2018 Matt Post

The field of machine translation faces an under-recognized problem because of inconsistency in the reporting of scores from its dominant metric.

Tasks: Machine Translation, Translation
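
The paper's core point, that n-gram scores are only comparable under identical reference preprocessing, can be demonstrated with plain clipped unigram precision (a minimal sketch; this is not BLEU itself, and the two tokenizers are arbitrary examples chosen for the demonstration):

```python
import re
from collections import Counter

def unigram_precision(hyp_toks, ref_toks):
    """Clipped unigram matches divided by hypothesis length."""
    hc, rc = Counter(hyp_toks), Counter(ref_toks)
    match = sum(min(n, rc[t]) for t, n in hc.items())
    return match / len(hyp_toks)

hyp, ref = "it's good.", "it is good."

# Tokenizer A: plain whitespace split
pa = unigram_precision(hyp.split(), ref.split())
# Tokenizer B: punctuation split into separate tokens
tok = lambda s: re.findall(r"\w+|[^\w\s]", s)
pb = unigram_precision(tok(hyp), tok(ref))

print(pa, pb)  # 0.5 0.6 -- same sentences, different scores
```

The same hypothesis and reference yield different scores under the two tokenizations, which is why scores computed with undocumented preprocessing cannot be compared across papers.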

Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

no code implementations NAACL 2018 Matt Post, David Vilar

The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms.

Tasks: Machine Translation, NMT (+1 more)
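
The beam-allocation idea can be sketched as bucketing candidates by the number of constraints they satisfy (a heavily simplified, single-step illustration, not the paper's actual algorithm; all names and data here are invented):

```python
def allocate_beam(hypotheses, constraints, beam_size):
    """Toy dynamic beam allocation: bucket (hypothesis, score) pairs
    by how many positive constraint tokens they already contain, then
    split the beam across the non-empty buckets so that partially
    constrained candidates are never crowded out by higher-scoring
    unconstrained ones."""
    def met(hyp):
        return sum(1 for c in constraints if c in hyp)
    buckets = {}
    for hyp, score in hypotheses:
        buckets.setdefault(met(hyp), []).append((score, hyp))
    per_bucket = max(1, beam_size // len(buckets))
    beam = []
    for k in sorted(buckets):
        beam += sorted(buckets[k], reverse=True)[:per_bucket]
    return [h for _, h in beam][:beam_size]

hyps = [("the dog", -1.0), ("a dog", -1.5),
        ("the cat", -2.0), ("a cat sat", -2.5)]
print(allocate_beam(hyps, ["cat"], beam_size=2))
# ['the dog', 'the cat'] -- one slot is reserved for a constrained candidate
```

With a plain top-2 beam, both surviving hypotheses would be the higher-scoring unconstrained ones, and the constraint could never be satisfied.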

Sockeye: A Toolkit for Neural Machine Translation

16 code implementations 15 Dec 2017 Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post

Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attentional recurrent neural networks, self-attentional transformers, and fully convolutional networks.

Tasks: Machine Translation, NMT (+1 more)

Error-repair Dependency Parsing for Ungrammatical Texts

1 code implementation ACL 2017 Keisuke Sakaguchi, Matt Post, Benjamin Van Durme

We propose a new dependency parsing scheme which jointly parses a sentence and repairs grammatical errors by extending the non-directional transition-based formalism of Goldberg and Elhadad (2010) with three additional actions: SUBSTITUTE, DELETE, INSERT.

Tasks: Dependency Parsing

Using of heterogeneous corpora for training of an ASR system

no code implementations 1 Jun 2017 Jan Trmal, Gaurav Kumar, Vimal Manohar, Sanjeev Khudanpur, Matt Post, Paul McNamee

The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation for low-resource languages".

Tasks: Speech Recognition (+2 more)

Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network

1 code implementation 7 Aug 2016 Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme

Inspired by the findings from the Cmabrigde Uinervtisy effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN).

Tasks: Spelling Correction
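
The semi-character input representation described in the abstract can be sketched directly (following the paper's description of a first character, an unordered bag of internal characters, and a last character; `semi_char` is a hypothetical name for this example):

```python
from collections import Counter

def semi_char(word):
    """Semi-character representation: (first char, bag of internal
    chars, last char). Scrambling the interior of a word leaves the
    representation unchanged, mirroring the Cmabrigde Uinervtisy
    effect the model is built around."""
    if len(word) <= 2:
        return (word[:1], Counter(), word[-1:])
    return (word[0], Counter(word[1:-1]), word[-1])

# Jumbled interior characters yield the same representation:
print(semi_char("Cmabrigde") == semi_char("Cambridge"))  # True
```

A network fed this representation can therefore recognize interior-scrambled words that defeat character-sequence models.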

GLEU Without Tuning

1 code implementation 9 May 2016 Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault

The GLEU metric was proposed for evaluating grammatical error corrections using n-gram overlap with a set of reference sentences, as opposed to precision/recall of specific annotated errors (Napoles et al., 2015).
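The n-gram-overlap idea behind GLEU can be sketched as follows (a simplified illustration of rewarding reference n-grams and penalizing uncorrected source n-grams, not the exact GLEU formula; the function and examples are invented for this sketch):

```python
def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def gleu_like(source, hypothesis, reference, n=2):
    """Reward hypothesis n-grams found in the reference; penalize
    those found only in the (erroneous) source, i.e. errors the
    correction system failed to fix."""
    s, h, r = source.split(), hypothesis.split(), reference.split()
    score = 0
    for g in ngrams(h, n):
        if g in ngrams(r, n):
            score += 1
        elif g in ngrams(s, n):
            score -= 1
    return score / max(1, len(ngrams(h, n)))

src, ref = "he go to school", "he goes to school"
print(gleu_like(src, "he goes to school", ref))  # perfect correction: 1.0
print(gleu_like(src, src, ref))                  # unchanged source is penalized
```

Unlike precision/recall over annotated error spans, this only needs a source sentence and plain reference corrections.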

Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality

1 code implementation TACL 2016 Keisuke Sakaguchi, Courtney Napoles, Matt Post, Joel Tetreault

The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics.

Tasks: Grammatical Error Correction

A Wikipedia-based Corpus for Contextualized Machine Translation

no code implementations LREC 2014 Jennifer Drexler, Pushpendre Rastogi, Jacqueline Aguilar, Benjamin Van Durme, Matt Post

We describe a corpus for target-contextualized machine translation (MT), where the task is to improve the translation of source documents using language models built over presumably related documents in the target language.

Tasks: Domain Adaptation, Language Modelling (+2 more)
