Search Results for author: Philipp Koehn

Found 144 papers, 30 papers with code

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation, Translation

Facebook AI’s WMT21 News Translation Task Submission

1 code implementation WMT (EMNLP) 2021 Chau Tran, Shruti Bhosale, James Cross, Philipp Koehn, Sergey Edunov, Angela Fan

We describe Facebook’s multilingual model submission to the WMT2021 shared task on news translation.

Translation

Machine Translation Quality and Post-Editor Productivity

no code implementations AMTA 2016 Marina Sanchez-Torron, Philipp Koehn

We assessed how different machine translation (MT) systems affect the post-editing (PE) process and product of professional English–Spanish translators.

Machine Translation, Translation

Embedding-Enhanced GIZA++: Improving Low-Resource Word Alignment Using Embeddings

no code implementations AMTA 2022 Kelly Marchisio, Conghao Xiong, Philipp Koehn

A popular natural language processing task decades ago, word alignment has been dominated until recently by GIZA++, a statistical method based on the 30-year-old IBM models.

Machine Translation, Translation +1

Dual Conditional Cross Entropy Scores and LASER Similarity Scores for the WMT20 Parallel Corpus Filtering Shared Task

no code implementations WMT (EMNLP) 2020 Felicia Koerner, Philipp Koehn

This paper describes our submission to the WMT20 Parallel Corpus Filtering and Alignment for Low-Resource Conditions Shared Task.
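
For context, below is a minimal sketch of dual conditional cross-entropy filtering (Junczys-Dowmunt, 2018), the scoring method named in the title; the exact normalization shown is my assumption of the standard formulation, not necessarily the submission's variant.

```python
import math

def dual_xent_score(h_fwd: float, h_bwd: float) -> float:
    """Dual conditional cross-entropy score (Junczys-Dowmunt, 2018).

    h_fwd: per-word cross-entropy of target given source (forward model).
    h_bwd: per-word cross-entropy of source given target (backward model).
    Returns a value in (0, 1]; higher suggests a cleaner sentence pair.
    """
    avg = (h_fwd + h_bwd) / 2.0        # how plausible both models find the pair
    disagreement = abs(h_fwd - h_bwd)  # penalize asymmetric pairs
    return math.exp(-(avg + disagreement))

print(dual_xent_score(2.1, 2.3))  # symmetric, fluent pair: ~0.09
print(dual_xent_score(2.1, 6.0))  # asymmetric, likely noisy: far lower
```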

An exploratory approach to the Parallel Corpus Filtering shared task WMT20

no code implementations WMT (EMNLP) 2020 Ankur Kejriwal, Philipp Koehn

In this document we describe our submission to the parallel corpus filtering task, using multilingual word embeddings, language models, and an ensemble of pre- and post-filtering rules.
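
As an illustration of what such pre-filtering rules typically look like, here is a minimal sketch; the specific checks and thresholds are illustrative assumptions, not the rules used in this submission.

```python
def passes_prefilter(src: str, tgt: str,
                     max_ratio: float = 2.0, max_len: int = 250) -> bool:
    """Cheap rule-based pre-filter for a candidate sentence pair."""
    s, t = src.split(), tgt.split()
    if not s or not t or len(s) > max_len or len(t) > max_len:
        return False                      # empty or implausibly long
    if max(len(s), len(t)) / min(len(s), len(t)) > max_ratio:
        return False                      # implausible length ratio
    if src.strip() == tgt.strip():
        return False                      # untranslated copy
    return True

print(passes_prefilter("ein kleiner Satz", "a short sentence"))  # True
print(passes_prefilter("hello", "hello"))                        # False
```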

Findings of the WMT 2020 Shared Task on Parallel Corpus Filtering and Alignment

no code implementations WMT (EMNLP) 2020 Philipp Koehn, Vishrav Chaudhary, Ahmed El-Kishky, Naman Goyal, Peng-Jen Chen, Francisco Guzmán

Following the two preceding WMT Shared Tasks on Parallel Corpus Filtering (Koehn et al., 2018, 2019), we again posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting the highest-quality data to be used to train machine translation systems.

Sentence, Translation

Neural Interactive Translation Prediction

no code implementations AMTA 2016 Rebecca Knowles, Philipp Koehn

We present an interactive translation prediction method based on neural machine translation.

Machine Translation, Translation
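
To make the interaction concrete: the system repeatedly proposes the next token given the translator's accepted prefix. A toy sketch follows, in which next_token is a hypothetical stand-in for one step of neural decoding.

```python
def next_token(prefix):
    """Placeholder for one NMT decoder step over the accepted prefix."""
    continuations = {"the": "cat", "cat": "sat", "sat": "<eos>"}
    return continuations.get(prefix[-1] if prefix else "", "the")

def complete(prefix, max_len=10):
    """Greedily extend the translator's prefix into a full suggestion."""
    out = list(prefix)
    while len(out) < max_len:
        tok = next_token(out)
        if tok == "<eos>":
            break
        out.append(tok)
    return out

print(complete(["the"]))  # a suggestion the translator can accept or override
```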

An Alignment-Based Approach to Semi-Supervised Bilingual Lexicon Induction with Small Parallel Corpora

1 code implementation MTSummit 2021 Kelly Marchisio, Philipp Koehn, Conghao Xiong

Aimed at generating a seed lexicon for use in downstream natural language tasks, unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently.

Bilingual Lexicon Induction, Translation

Learning Curricula for Multilingual Neural Machine Translation Training

no code implementations MTSummit 2021 Gaurav Kumar, Philipp Koehn, Sanjeev Khudanpur

Low-resource Multilingual Neural Machine Translation (MNMT) is typically tasked with improving the translation performance on one or more language pairs with the aid of high-resource language pairs.

Machine Translation, Translation

Streaming Sequence Transduction through Dynamic Compression

1 code implementation 2 Feb 2024 Weiting Tan, Yunmo Chen, Tongfei Chen, Guanghui Qin, Haoran Xu, Heidi C. Zhang, Benjamin Van Durme, Philipp Koehn

We introduce STAR (Stream Transduction with Anchor Representations), a novel Transformer-based model designed for efficient sequence-to-sequence transduction over streams.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +1

The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts

no code implementations 23 Jan 2024 Lingfeng Shen, Weiting Tan, Sihao Chen, Yunmo Chen, Jingyu Zhang, Haoran Xu, Boyuan Zheng, Philipp Koehn, Daniel Khashabi

As the influence of large language models (LLMs) spans global communities, their safety challenges in multilingual settings become paramount for alignment research.

Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles

no code implementations 4 Nov 2023 Weiting Tan, Haoran Xu, Lingfeng Shen, Shuyue Stella Li, Kenton Murray, Philipp Koehn, Benjamin Van Durme, Yunmo Chen

Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning.

In-Context Learning, Machine Translation +1

Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport

no code implementations 25 Oct 2022 Kelly Marchisio, Ali Saad-Eldin, Kevin Duh, Carey Priebe, Philipp Koehn

Bilingual lexicons form a critical component of various natural language processing applications, including unsupervised and semi-supervised machine translation and cross-lingual information retrieval.

Bilingual Lexicon Induction, Graph Matching +3

IsoVec: Controlling the Relative Isomorphism of Word Embedding Spaces

1 code implementation 11 Oct 2022 Kelly Marchisio, Neha Verma, Kevin Duh, Philipp Koehn

The ability to extract high-quality translation dictionaries from monolingual word embedding spaces depends critically on the geometric similarity of the spaces, that is, their degree of "isomorphism."

Bilingual Lexicon Induction, Translation
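
A common rough proxy for this degree of isomorphism correlates the two spaces' internal cosine-similarity structures over row-aligned word pairs; the sketch below is one generic variant of such a measure, an assumption on my part rather than the paper's exact metric.

```python
import numpy as np

def relational_isomorphism(X: np.ndarray, Y: np.ndarray) -> float:
    """Correlation of intra-space cosine similarities; rows of X and Y
    are assumed to correspond (e.g., via a seed dictionary)."""
    def cos_matrix(E):
        E = E / np.linalg.norm(E, axis=1, keepdims=True)
        return E @ E.T
    sx, sy = cos_matrix(X), cos_matrix(Y)
    iu = np.triu_indices_from(sx, k=1)          # unique off-diagonal entries
    return float(np.corrcoef(sx[iu], sy[iu])[0, 1])

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))  # a random rotation
print(relational_isomorphism(X, X @ Q))         # ~1.0: rotation preserves geometry
```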

Multilingual Representation Distillation with Contrastive Learning

no code implementations 10 Oct 2022 Weiting Tan, Kevin Heffernan, Holger Schwenk, Philipp Koehn

Multilingual sentence representations from large models encode semantic information from two or more languages and can be used for different cross-lingual information retrieval and matching tasks.

Contrastive Learning, Cross-Lingual Information Retrieval +2
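
As a sketch of the kind of batch-level contrastive (InfoNCE-style) objective used to align sentence representations across languages; the loss form, dimensions, and temperature here are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(src_vecs, tgt_vecs, temperature=0.05):
    """Treat the i-th target as the positive for the i-th source;
    all other in-batch targets act as negatives."""
    src = F.normalize(src_vecs, dim=-1)
    tgt = F.normalize(tgt_vecs, dim=-1)
    logits = src @ tgt.T / temperature       # (B, B) similarity matrix
    labels = torch.arange(len(src))          # diagonal entries are positives
    return F.cross_entropy(logits, labels)

src = torch.randn(8, 128)   # toy "source-language" sentence vectors
tgt = torch.randn(8, 128)   # toy "target-language" sentence vectors
print(contrastive_loss(src, tgt))
```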

The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains

1 code implementation 23 May 2022 Haoran Xu, Philipp Koehn, Kenton Murray

We first highlight the large sensitivity (contribution) gap between high-sensitivity and low-sensitivity parameters and show that model generalization performance can be significantly improved by balancing the contribution of all parameters.

Machine Translation, Natural Language Understanding +2

Consistent Human Evaluation of Machine Translation across Language Pairs

no code implementations AMTA 2022 Daniel Licht, Cynthia Gao, Janice Lam, Francisco Guzman, Mona Diab, Philipp Koehn

Obtaining meaningful quality scores for machine translation systems through human evaluation remains a challenge given the high variability between human evaluators, partly due to subjective expectations for translation quality for different language pairs.

Machine Translation, Translation

Learn To Remember: Transformer with Recurrent Memory for Document-Level Machine Translation

no code implementations Findings (NAACL) 2022 Yukun Feng, Feng Li, Ziang Song, Boyuan Zheng, Philipp Koehn

We conduct experiments on three popular datasets for document-level machine translation, and our model achieves an average improvement of 0.91 s-BLEU over the sentence-level baseline.

Document Level Machine Translation, Machine Translation +2

Data Selection Curriculum for Neural Machine Translation

no code implementations 25 Mar 2022 Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross, Shruti Bhosale, Shafiq Joty

In this work, we introduce a two-stage curriculum training framework for NMT where we fine-tune a base NMT model on subsets of data, selected by both deterministic scoring using pre-trained methods and online scoring that considers prediction scores of the emerging NMT model.

Machine Translation, NMT +1

Alternative Input Signals Ease Transfer in Multilingual Machine Translation

no code implementations ACL 2022 Simeng Sun, Angela Fan, James Cross, Vishrav Chaudhary, Chau Tran, Philipp Koehn, Francisco Guzman

Further, we find that incorporating alternative inputs via self-ensemble can be particularly effective when the training set is small, leading to +5 BLEU when only 5% of the total training data is accessible.

Machine Translation, Translation

Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation

no code implementations ICLR 2022 Xuan-Phi Nguyen, Hongyu Gong, Yun Tang, Changhan Wang, Philipp Koehn, Shafiq Joty

Modern unsupervised machine translation systems mostly train their models by generating synthetic parallel training data from large unlabeled monolingual corpora of different languages through various means, such as iterative back-translation.

Clustering, Translation +1

Levenshtein Training for Word-level Quality Estimation

1 code implementation EMNLP 2021 Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Philipp Koehn

We propose a novel scheme to use the Levenshtein Transformer to perform the task of word-level quality estimation.

Transfer Learning, Translation
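
Word-level QE labels of this kind are conventionally derived by edit-aligning the MT hypothesis against a post-edited reference, tagging matched tokens OK and the rest BAD. A minimal sketch using Python's difflib as a stand-in for a true Levenshtein alignment:

```python
from difflib import SequenceMatcher

def word_tags(mt, pe):
    """Tag each MT token OK/BAD via edit alignment with the post-edit."""
    tags = ["BAD"] * len(mt)
    sm = SequenceMatcher(a=mt, b=pe, autojunk=False)
    for block in sm.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            tags[i] = "OK"                  # token survives post-editing
    return tags

print(word_tags("the cat sat on mat".split(),
                "the cat sat on the mat".split()))
# ['OK', 'OK', 'OK', 'OK', 'OK']  (the missing word is a gap, not a BAD token)
```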

Cross-Lingual BERT Contextual Embedding Space Mapping with Isotropic and Isometric Conditions

1 code implementation 19 Jul 2021 Haoran Xu, Philipp Koehn

Typically, a linearly orthogonal transformation mapping is learned by aligning static type-level embeddings to build a shared semantic space.
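
One standard construction for such a linearly orthogonal mapping over aligned type-level embeddings is the orthogonal Procrustes solution; the sketch below shows that generic construction on synthetic data, not the paper's exact training recipe.

```python
import numpy as np

def procrustes_map(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Solve min_W ||XW - Y||_F with W orthogonal, for row-aligned
    source embeddings X and target embeddings Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
Q, _ = np.linalg.qr(rng.normal(size=(64, 64)))  # hidden "true" rotation
W = procrustes_map(X, X @ Q)
print(np.allclose(X @ W, X @ Q))                # True: the rotation is recovered
```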

On the Evaluation of Machine Translation for Terminology Consistency

1 code implementation 22 Jun 2021 Md Mahfuz ibn Alam, Antonios Anastasopoulos, Laurent Besacier, James Cross, Matthias Gallé, Philipp Koehn, Vassilina Nikoulina

As neural machine translation (NMT) systems become an important part of professional translator pipelines, a growing body of work focuses on combining NMT with terminologies.

Domain Adaptation, Machine Translation +2

Evaluating Saliency Methods for Neural Language Models

1 code implementation NAACL 2021 Shuoyang Ding, Philipp Koehn

Saliency methods are widely used to interpret neural network predictions, but different variants of saliency methods often disagree even on the interpretations of the same prediction made by the same model.

Sentence
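
As one example of the saliency variants such an evaluation covers, here is gradient-times-input on a toy next-token model; the architecture, vocabulary, and target below are placeholders, not the models studied in the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, dim = 10, 8
emb = nn.Embedding(vocab, dim)     # toy embedding table
head = nn.Linear(dim, vocab)       # toy output layer

tokens = torch.tensor([1, 4, 7])   # hypothetical context tokens
x = emb(tokens)
x.retain_grad()                    # keep gradients on the embeddings
logits = head(x.mean(dim=0))       # predict next token from mean embedding
logits[3].backward()               # gradient of one target logit

saliency = (x.grad * x).sum(dim=-1)  # gradient-times-input per token
print(saliency)                      # one attribution score per context token
```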

Learning Feature Weights using Reward Modeling for Denoising Parallel Corpora

no code implementations WMT (EMNLP) 2021 Gaurav Kumar, Philipp Koehn, Sanjeev Khudanpur

These feature weights, which are optimized directly for the task of improving translation performance, are used to score and filter sentences in the noisy corpora more effectively.

Denoising, Language Modelling +4

Learning Policies for Multilingual Training of Neural Machine Translation Systems

no code implementations 11 Mar 2021 Gaurav Kumar, Philipp Koehn, Sanjeev Khudanpur

Low-resource Multilingual Neural Machine Translation (MNMT) is typically tasked with improving the translation performance on one or more language pairs with the aid of high-resource language pairs.

Machine Translation, Translation

Zero-Shot Cross-Lingual Dependency Parsing through Contextual Embedding Transformation

1 code implementation EACL (AdaptNLP) 2021 Haoran Xu, Philipp Koehn

Linear embedding transformation has been shown to be effective for zero-shot cross-lingual transfer tasks and achieve surprisingly promising results.

Dependency Parsing, Translation +1

Exploiting Sentence Order in Document Alignment

1 code implementation EMNLP 2020 Brian Thompson, Philipp Koehn

We present a simple document alignment method that incorporates sentence order information in both candidate generation and candidate re-scoring.

Sentence

Simulated Multiple Reference Training Improves Low-Resource Machine Translation

1 code implementation EMNLP 2020 Huda Khayrallah, Brian Thompson, Matt Post, Philipp Koehn

Many valid translations exist for a given sentence, yet machine translation (MT) is trained with a single reference translation, exacerbating data sparsity in low-resource settings.

Machine Translation, Sentence +2

When Does Unsupervised Machine Translation Work?

no code implementations WMT (EMNLP) 2020 Kelly Marchisio, Kevin Duh, Philipp Koehn

We additionally find that unsupervised MT performance declines when source and target languages use different scripts, and observe very poor performance on authentic low-resource language pairs.

Translation, Unsupervised Machine Translation

CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs

no code implementations EMNLP 2020 Ahmed El-Kishky, Vishrav Chaudhary, Francisco Guzman, Philipp Koehn

We mine sixty-eight snapshots of the Common Crawl corpus and identify web document pairs that are translations of each other.
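
A toy sketch of the URL-matching idea behind this kind of mining, under the assumption that language identifiers appear as URL path segments; the pattern set is deliberately tiny and illustrative, not the paper's matching rules.

```python
import re
from collections import defaultdict

LANG = re.compile(r"(?<=[/._-])(en|fr|de|es)(?=[/._-]|$)")

def url_key(url):
    """Split a URL into (language, language-stripped residue), if possible."""
    m = LANG.search(url)
    if m is None:
        return None
    return m.group(1), LANG.sub("xx", url, count=1)

groups = defaultdict(list)
for u in ["http://site.com/en/page1", "http://site.com/fr/page1"]:
    key = url_key(u)
    if key:
        lang, residue = key
        groups[residue].append((lang, u))

# URLs that collapse to the same residue are candidate document pairs.
print(dict(groups))
```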

Spelling-Aware Construction of Macaronic Texts for Teaching Foreign-Language Vocabulary

no code implementations IJCNLP 2019 Adithya Renduchintala, Philipp Koehn, Jason Eisner

We present a machine foreign-language teacher that modifies text in a student's native language (L1) by replacing some word tokens with glosses in a foreign language (L2), in such a way that the student can acquire L2 vocabulary simply by reading the resulting macaronic text.

Language Modelling

Vecalign: Improved Sentence Alignment in Linear Time and Space

no code implementations IJCNLP 2019 Brian Thompson, Philipp Koehn

It substantially outperforms the popular Hunalign toolkit at recovering Bible verse alignments in medium- to low-resource language pairs, and it improves downstream MT quality by 1.7 and 1.6 BLEU in Sinhala-English and Nepali-English, respectively, compared to the Hunalign-based Paracrawl pipeline.

Machine Translation, Sentence +2

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

no code implementations WS 2019 Philipp Koehn, Francisco Guzmán, Vishrav Chaudhary, Juan Pino

Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems.

Machine Translation, Sentence +1

Johns Hopkins University Submission for WMT News Translation Task

no code implementations WS 2019 Kelly Marchisio, Yash Kumar Lal, Philipp Koehn

We describe the work of Johns Hopkins University for the shared task of news translation organized by the Fourth Conference on Machine Translation (2019).

Machine Translation, Translation

Simple Construction of Mixed-Language Texts for Vocabulary Learning

no code implementations WS 2019 Adithya Renduchintala, Philipp Koehn, Jason Eisner

We accomplish this by modifying a cloze language model to incrementally learn new vocabulary items, and use this language model as a proxy for the word guessing and learning ability of real students.

Language Modelling

De-Mixing Sentiment from Code-Mixed Text

no code implementations ACL 2019 Yash Kumar Lal, Vaibhav Kumar, Mrinal Dhar, Manish Shrivastava, Philipp Koehn

The Collective Encoder captures the overall sentiment of the sentence, while the Specific Encoder utilizes an attention mechanism in order to focus on individual sentiment-bearing sub-words.

Sentence, Sentiment Analysis +1

Saliency-driven Word Alignment Interpretation for Neural Machine Translation

1 code implementation WS 2019 Shuoyang Ding, Hainan Xu, Philipp Koehn

Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments.

Machine Translation, NMT +2

Translationese in Machine Translation Evaluation

no code implementations 24 Jun 2019 Yvette Graham, Barry Haddow, Philipp Koehn

Finally, we provide a comprehensive checklist for future machine translation evaluation.

Machine Translation, Translation

Parallelizable Stack Long Short-Term Memory

1 code implementation WS 2019 Shuoyang Ding, Philipp Koehn

Stack Long Short-Term Memory (StackLSTM) is useful for applications such as parsing and string-to-tree neural machine translation, but it is also notoriously difficult to parallelize for GPU training because its computations depend on discrete operations.

Machine Translation, Translation

Context and Copying in Neural Machine Translation

no code implementations EMNLP 2018 Rebecca Knowles, Philipp Koehn

In this work, we show that they learn to copy words based on both the context in which the words appear as well as features of the words themselves.

Machine Translation, Translation

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

no code implementations WS 2018 Philipp Koehn, Huda Khayrallah, Kenneth Heafield, Mikel L. Forcada

We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems.

Machine Translation, Outlier Detection +2

The JHU Machine Translation Systems for WMT 2018

no code implementations WS 2018 Philipp Koehn, Kevin Duh, Brian Thompson

We report on the efforts of the Johns Hopkins University to develop neural machine translation systems for the shared task for news translation organized around the Conference for Machine Translation (WMT) 2018.

Machine Translation, Translation

Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation

1 code implementation WS 2018 Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, Philipp Koehn

To better understand the effectiveness of continued training, we analyze the major components of a neural machine translation system (the encoder, decoder, and each embedding space) and consider each component's contribution to, and capacity for, domain adaptation.

Domain Adaptation, Machine Translation +1
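
Freezing one component so that continued training updates only the rest takes a few lines in most frameworks; a minimal PyTorch sketch with a placeholder model, not the paper's architecture:

```python
import torch.nn as nn

model = nn.ModuleDict({
    "encoder": nn.Linear(16, 16),   # stand-ins for real NMT components
    "decoder": nn.Linear(16, 16),
})

def freeze(module: nn.Module) -> None:
    for p in module.parameters():
        p.requires_grad = False      # the optimizer will skip these weights

freeze(model["encoder"])             # e.g., adapt only the decoder in-domain
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable, "trainable parameters remain")
```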

Character-Aware Decoder for Translation into Morphologically Rich Languages

no code implementations WS 2019 Adithya Renduchintala, Pamela Shapiro, Kevin Duh, Philipp Koehn

Neural machine translation (NMT) systems operate primarily on words (or sub-words), ignoring lower-level patterns of morphology.

Machine Translation, NMT +1

Regularized Training Objective for Continued Training for Domain Adaptation in Neural Machine Translation

1 code implementation WS 2018 Huda Khayrallah, Brian Thompson, Kevin Duh, Philipp Koehn

Supervised domain adaptation, where a large generic corpus and a smaller in-domain corpus are both available for training, is a challenge for neural machine translation (NMT).

Domain Adaptation, Machine Translation +2

Document-Level Adaptation for Neural Machine Translation

no code implementations WS 2018 Sachith Sri Ram Kothur, Rebecca Knowles, Philipp Koehn

It is common practice to adapt machine translation systems to novel domains, but even a well-adapted system may be able to perform better on a particular document if it were to learn from a translator's corrections within the document itself.

Machine Translation, Sentence +2

Iterative Back-Translation for Neural Machine Translation

no code implementations WS 2018 Vu Cong Duy Hoang, Philipp Koehn, Gholamreza Haffari, Trevor Cohn

We present iterative back-translation, a method for generating increasingly better synthetic parallel data from monolingual data to train neural machine translation systems.

Machine Translation, Translation
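
The loop itself is simple; a schematic sketch follows, with train and translate as hypothetical stand-ins for a real NMT training and decoding pipeline.

```python
def train(pairs):
    """Placeholder trainer: a real system would fit an NMT model here."""
    return {"data": list(pairs)}

def translate(model, sentences):
    """Placeholder decoder: a real system would translate here."""
    return [s.upper() for s in sentences]   # dummy output

def iterative_back_translation(parallel, mono_src, mono_tgt, rounds=2):
    fwd = train(parallel)                           # source -> target
    bwd = train([(t, s) for s, t in parallel])      # target -> source
    for _ in range(rounds):
        # Back-translate monolingual data with the current models, then
        # retrain each direction on real plus synthetic pairs.
        synth_src = translate(bwd, mono_tgt)
        synth_tgt = translate(fwd, mono_src)
        fwd = train(parallel + list(zip(synth_src, mono_tgt)))
        bwd = train([(t, s) for s, t in parallel] + list(zip(synth_tgt, mono_src)))
    return fwd, bwd

fwd, bwd = iterative_back_translation([("hallo welt", "hello world")],
                                      ["guten morgen"], ["good evening"])
```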

On the Impact of Various Types of Noise on Neural Machine Translation

1 code implementation WS 2018 Huda Khayrallah, Philipp Koehn

We examine how various types of noise in the parallel training data impact the quality of neural machine translation systems.

Machine Translation, Sentence +1

Neural Machine Translation

5 code implementations 22 Sep 2017 Philipp Koehn

Draft of textbook chapter on neural machine translation.

Machine Translation, Translation

Knowledge Tracing in Sequential Learning of Inflected Vocabulary

no code implementations CONLL 2017 Adithya Renduchintala, Philipp Koehn, Jason Eisner

We present a feature-rich knowledge tracing method that captures a student's acquisition and retention of knowledge during a foreign language phrase learning task.

Knowledge Tracing, Structured Prediction

Six Challenges for Neural Machine Translation

no code implementations WS 2017 Philipp Koehn, Rebecca Knowles

We explore six challenges for neural machine translation: domain mismatch, amount of training data, rare words, long sentences, word alignment, and beam search.

Machine Translation, Translation +1

Predicting Target Language CCG Supertags Improves Neural Machine Translation

no code implementations WS 2017 Maria Nadejde, Siva Reddy, Rico Sennrich, Tomasz Dwojak, Marcin Junczys-Dowmunt, Philipp Koehn, Alexandra Birch

Our results on WMT data show that explicitly modeling target syntax improves machine translation quality for German->English, a high-resource pair, and for Romanian->English, a low-resource pair, and also improves the handling of several syntactic phenomena, including prepositional phrase attachment.

Machine Translation, NMT +2

Europarl: A Parallel Corpus for Statistical Machine Translation

no code implementations MTSummit 2005 Philipp Koehn

We collected a corpus of parallel text in 11 languages from the proceedings of the European Parliament, which are published on the web.

Machine Translation, Translation

Statistical Phrase-Based Translation

no code implementations HLT-NAACL 2003 Philipp Koehn, Franz J. Och, Daniel Marcu

We propose a new phrase-based translation model and decoding algorithm that enable us to evaluate and compare several previously proposed phrase-based translation models.

Machine Translation, Translation
