no code implementations • AMTA 2016 • Hamidreza Ghader, Christof Monz
Lexicalized and hierarchical reordering models use relative frequencies of fully lexicalized phrase pairs to learn phrase reordering distributions.
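The relative-frequency estimation mentioned above can be sketched as follows. This is a minimal illustration with invented phrase pairs and counts, not code or data from the paper:

```python
# Hypothetical sketch: estimating a lexicalized reordering distribution
# by relative frequency of orientations per phrase pair, as in
# phrase-based MT. All events below are illustrative toy data.
from collections import Counter, defaultdict

# (source phrase, target phrase, orientation) extraction events
events = [
    ("das Haus", "the house", "monotone"),
    ("das Haus", "the house", "monotone"),
    ("das Haus", "the house", "swap"),
    ("nicht", "not", "swap"),
]

counts = defaultdict(Counter)
for src, tgt, orient in events:
    counts[(src, tgt)][orient] += 1

def reordering_prob(src, tgt, orient):
    """P(orientation | phrase pair) as a relative frequency."""
    c = counts[(src, tgt)]
    total = sum(c.values())
    return c[orient] / total if total else 0.0

# P(monotone | "das Haus" -> "the house") = 2/3
```

Because the estimate is conditioned on fully lexicalized phrase pairs, rare pairs get very sparse counts, which is the sparsity problem such models face.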
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • 19 Feb 2024 • Baohao Liao, Christof Monz
With the growing size of large language models, the role of quantization becomes increasingly significant.
no code implementations • 7 Feb 2024 • Baohao Liao, Christof Monz
Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention with the increasing size of LLMs, primarily due to GPU memory constraints and the fact that these methods achieve results comparable to full finetuning.
no code implementations • 3 Feb 2024 • Sara Rajaee, Christof Monz
Recent advances in training multilingual language models on large datasets have shown promising results in knowledge transfer across languages, achieving high performance on downstream tasks.
no code implementations • 1 Feb 2024 • Yan Meng, Christof Monz
In this paper, we conduct a large-scale study that varies the auxiliary target side languages along two dimensions, i.e., linguistic similarity and corpus size, to show the dynamic impact of knowledge transfer on the main language pairs.
1 code implementation • 22 Jan 2024 • Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz
Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem.
1 code implementation • 23 Oct 2023 • David Stap, Christof Monz
k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples.
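The datastore idea can be illustrated with a toy retrieval step. This is a minimal sketch assuming 2-D hidden states, Euclidean distance, a softmax over negative distances, and an interpolation weight `lam`; none of these specifics come from the paper:

```python
# Minimal kNN-MT-style sketch: cache (hidden state, target token) pairs,
# retrieve nearest neighbors at decoding time, and interpolate the
# resulting distribution with the model's own. Toy data throughout.
import math

datastore = [  # (hidden state, target token) pairs cached from training
    ((0.0, 1.0), "cat"),
    ((0.1, 0.9), "cat"),
    ((1.0, 0.0), "dog"),
]

def knn_probs(query, k=2):
    """Turn the k nearest cached states into a distribution over tokens."""
    nearest = sorted((math.dist(query, h), tok) for h, tok in datastore)[:k]
    weights = [math.exp(-d) for d, _ in nearest]  # softmax over -distance
    z = sum(weights)
    probs = {}
    for (_, tok), w in zip(nearest, weights):
        probs[tok] = probs.get(tok, 0.0) + w / z
    return probs

def interpolate(p_model, p_knn, lam=0.5):
    """Mix the base model distribution with the retrieved one."""
    tokens = set(p_model) | set(p_knn)
    return {t: (1 - lam) * p_model.get(t, 0.0) + lam * p_knn.get(t, 0.0)
            for t in tokens}
```

A query close to the cached "cat" states yields a distribution dominated by "cat", which is then blended with the NMT model's own prediction.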
no code implementations • 20 Oct 2023 • Quinten Bolding, Baohao Liao, Brandon James Denis, Jun Luo, Christof Monz
Lastly, experiments on C-MTNT showcased its effectiveness in evaluating the robustness of NMT models, highlighting the potential of advanced language models for data cleaning and emphasizing C-MTNT as a valuable resource.
1 code implementation • 16 Oct 2023 • Shaomu Tan, Christof Monz
Our findings highlight that the target side translation quality is the most influential factor, with vocabulary overlap consistently impacting ZS performance.
no code implementations • 15 Oct 2023 • Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz
This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation.
no code implementations • 24 Jul 2023 • Ali Araabi, Vlad Niculae, Christof Monz
Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly due to a limited ability to handle previously unseen inputs, i.e., generalization.
1 code implementation • NeurIPS 2023 • Baohao Liao, Shaomu Tan, Christof Monz
One effective way to reduce activation memory is to use a reversible model, so that intermediate activations need not be cached and can instead be recomputed.
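The recomputation trick rests on the fact that a reversible block can be inverted exactly. A minimal scalar sketch, where `F` and `G` are placeholder functions standing in for the sublayers (the actual architecture in the paper differs):

```python
# Sketch of a reversible residual block: the inputs can be exactly
# reconstructed from the outputs, so activations need not be stored
# during the forward pass. F and G are toy stand-ins for sublayers.
def F(x):
    return 0.5 * x + 1.0

def G(x):
    return 0.25 * x

def forward(x1, x2):
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def reconstruct(y1, y2):
    # Invert the block: recompute the inputs instead of caching them
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2
```

During backpropagation, each block's inputs are recovered from its outputs, trading a little extra compute for a large saving in activation memory.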
no code implementations • 26 May 2023 • Baohao Liao, Yan Meng, Christof Monz
Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while utilizing significantly fewer trainable parameters, and consequently addressing the storage and communication constraints.
1 code implementation • 23 May 2023 • Di Wu, Christof Monz
Using a vocabulary that is shared across languages is common practice in Multilingual Neural Machine Translation (MNMT).
no code implementations • 19 May 2023 • David Stap, Vlad Niculae, Christof Monz
We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation.
1 code implementation • 9 Nov 2022 • Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz
We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers.
no code implementations • AMTA 2022 • Ali Araabi, Christof Monz, Vlad Niculae
While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured.
1 code implementation • EACL 2021 • Amir Soleimani, Christof Monz, Marcel Worring
We introduce NLQuAD, the first data set with baseline methods for non-factoid long question answering, a task requiring document-level language understanding.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • COLING 2020 • Ali Araabi, Christof Monz
Language pairs with limited amounts of parallel data, also known as low-resource languages, remain a challenge for neural machine translation.
1 code implementation • WS 2020 • Marzieh Fadaee, Christof Monz
Recent works have shown that Neural Machine Translation (NMT) models achieve impressive performance; however, questions about understanding the behavior of these models remain unanswered.
1 code implementation • 29 Apr 2020 • Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, Maarten de Rijke
In this paper, we address the problem of answering complex information needs by conversing with search engines, in the sense that users can express their queries in natural language and directly receive the information they need from a short system response in a conversational manner.
1 code implementation • 26 Mar 2020 • Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke
We hypothesize that the deeper reason is that the training corpora contain hard tokens that are more difficult for a generative model to learn than others; once training has finished, these hard tokens remain under-learned, so that repetitive generations are more likely to happen.
2 code implementations • 19 Nov 2019 • Jiahuan Pei, Pengjie Ren, Christof Monz, Maarten de Rijke
We propose a novel mixture-of-generators network (MoGNet) for DRG, where we assume that each token of a response is drawn from a mixture of distributions.
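The mixture assumption amounts to a weighted sum of expert distributions per token, p(token) = Σ_k w_k · p_k(token). A toy sketch with invented experts and weights (not the paper's learned components):

```python
# Sketch of a mixture-of-generators token distribution: each expert
# proposes a distribution over the vocabulary, and the mixture combines
# them with gating weights. All numbers below are illustrative.
experts = [
    {"yes": 0.7, "no": 0.3},  # expert specialized on one response style
    {"yes": 0.2, "no": 0.8},  # expert specialized on another
]
weights = [0.6, 0.4]          # gating weights, summing to 1

def mixture(experts, weights):
    out = {}
    for w, p in zip(weights, experts):
        for tok, prob in p.items():
            out[tok] = out.get(tok, 0.0) + w * prob
    return out

p = mixture(experts, weights)
# p["yes"] = 0.6 * 0.7 + 0.4 * 0.2 = 0.50
```

In MoGNet the gating weights and expert distributions are produced by the network; here they are fixed constants purely to show the combination rule.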
2 code implementations • 7 Oct 2019 • Amir Soleimani, Christof Monz, Marcel Worring
Motivated by the promising performance of pre-trained language models, we investigate BERT in an evidence retrieval and claim verification pipeline for the FEVER fact extraction and verification challenge.
1 code implementation • 26 Aug 2019 • Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
Given a conversational context and background knowledge, we first learn a topic transition vector to encode the most likely text fragments to be used in the next response, which is then used to guide the local KS at each decoding time step.
1 code implementation • 18 Aug 2019 • Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
In this paper, we propose a Reference-aware Network (RefNet) to address the two issues.
no code implementations • WS 2019 • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, Marcos Zampieri
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
no code implementations • WS 2019 • Hamidreza Ghader, Christof Monz
We compare transformer and recurrent models in a more intrinsic way in terms of capturing lexical semantics and syntactic structures, in contrast to extrinsic approaches used by previous works.
2 code implementations • 25 Feb 2019 • Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke
Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses.
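One common remedy for the high-frequency bias described above is to reweight the cross-entropy loss by token frequency. The weighting scheme below is purely illustrative and not the paper's exact formulation:

```python
# Sketch of frequency-weighted cross-entropy: down-weight the loss on
# high-frequency tokens so training is not dominated by generic words,
# which would otherwise push the model toward low-diversity responses.
# Frequencies and the weighting scheme are illustrative assumptions.
import math

token_freq = {"the": 1000, "interesting": 10}
total = sum(token_freq.values())

def weighted_ce(token, prob):
    """Negative log-likelihood scaled down for frequent tokens."""
    weight = 1.0 - token_freq[token] / total  # rarer token -> larger weight
    return -weight * math.log(prob)
```

For the same model probability, the rare token now contributes far more loss than the frequent one, counteracting the CE preference for high-frequency tokens.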
no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
no code implementations • WS 2018 • Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, Christof Monz
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018.
no code implementations • EMNLP 2018 • Marzieh Fadaee, Christof Monz
In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of synthetic data.
1 code implementation • EMNLP 2018 • Ke Tran, Arianna Bisazza, Christof Monz
Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016).
1 code implementation • LREC 2018 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs.
no code implementations • IJCNLP 2017 • Hamidreza Ghader, Christof Monz
Thus, the question remains how attention is similar to or different from traditional alignment.
no code implementations • WS 2017 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shu-Jian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, Marco Turchi
1 code implementation • EMNLP 2017 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT).
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora.
Data Augmentation • Low-Resource Neural Machine Translation • +2
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Distributed word representations are widely used for modeling words in NLP tasks.
no code implementations • WS 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic.
no code implementations • COLING 2016 • Ekaterina Garmash, Christof Monz
We compare two methods of ensemble set induction: sampling parameter initializations for an NMT system, which is a relatively established method in NMT (Sutskever et al., 2014), and NMT systems translating from different source languages into the same target language, i.e., multi-source ensembles, a method recently introduced by Firat et al. (2016).
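Whichever way the ensemble set is induced, the combination step is the same: merge the member systems' per-token distributions at each decoding step. A toy sketch using a plain average over probabilities (real NMT ensembles typically average log-probabilities; the distributions below are invented):

```python
# Sketch of ensemble combination at one decoding step: average the
# per-token distributions proposed by several member systems.
# Toy two-token vocabulary; member distributions are illustrative.
def ensemble(dists):
    """Average a list of token->probability dicts."""
    tokens = set().union(*dists)
    n = len(dists)
    return {t: sum(d.get(t, 0.0) for d in dists) / n for t in tokens}

step = ensemble([
    {"house": 0.6, "home": 0.4},  # e.g. a system from one initialization
    {"house": 0.8, "home": 0.2},  # e.g. a system from another source language
])
# step["house"] = (0.6 + 0.8) / 2 = 0.7
```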
no code implementations • COLING 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.
no code implementations • 12 Oct 2016 • Hendrik Heuer, Christof Monz, Arnold W. M. Smeulders
This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable image caption generation performance by translating from a set of nouns to captions.
no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi
no code implementations • WS 2016 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri
2 code implementations • NAACL 2016 • Ke Tran, Arianna Bisazza, Christof Monz
In this paper, we propose Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data.
no code implementations • WS 2015 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi
no code implementations • WS 2014 • Ondřej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, Aleš Tamchyna