Search Results for author: Christof Monz

Found 68 papers, 21 papers with code

Which Words Matter in Defining Phrase Reordering Behavior in Statistical Machine Translation?

no code implementations AMTA 2016 Hamidreza Ghader, Christof Monz

Lexicalized and hierarchical reordering models use relative frequencies of fully lexicalized phrase pairs to learn phrase reordering distributions.

Language Modelling Machine Translation +1

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation Translation

Is It a Free Lunch for Removing Outliers during Pretraining?

no code implementations 19 Feb 2024 Baohao Liao, Christof Monz

With the growing size of large language models, the role of quantization becomes increasingly significant.

Quantization

ApiQ: Finetuning of 2-Bit Quantized Large Language Model

no code implementations 7 Feb 2024 Baohao Liao, Christof Monz

Memory-efficient finetuning of large language models (LLMs) has recently attracted huge attention with the increasing size of LLMs, primarily due to the constraints posed by GPU memory limitations and the comparable results of these methods with full finetuning.

Language Modelling Large Language Model +1

Analyzing the Evaluation of Cross-Lingual Knowledge Transfer in Multilingual Language Models

no code implementations 3 Feb 2024 Sara Rajaee, Christof Monz

Recent advances in training multilingual language models on large datasets seem to have shown promising results in knowledge transfer across languages and achieve high performance on downstream tasks.

Transfer Learning

Disentangling the Roles of Target-Side Transfer and Regularization in Multilingual Machine Translation

no code implementations 1 Feb 2024 Yan Meng, Christof Monz

In this paper, we conduct a large-scale study that varies the auxiliary target side languages along two dimensions, i.e., linguistic similarity and corpus size, to show the dynamic impact of knowledge transfer on the main language pairs.

Machine Translation Transfer Learning +1

How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual Translation via Tiny Multi-Parallel Data

1 code implementation 22 Jan 2024 Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz

Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem.

Machine Translation Translation

Multilingual k-Nearest-Neighbor Machine Translation

1 code implementation 23 Oct 2023 David Stap, Christof Monz

k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples.

Machine Translation Translation

Ask Language Model to Clean Your Noisy Translation Data

no code implementations 20 Oct 2023 Quinten Bolding, Baohao Liao, Brandon James Denis, Jun Luo, Christof Monz

Lastly, experiments on C-MTNT showcased its effectiveness in evaluating the robustness of NMT models, highlighting the potential of advanced language models for data cleaning and emphasizing C-MTNT as a valuable resource.

Language Modelling Machine Translation +2

Towards a Better Understanding of Variations in Zero-Shot Neural Machine Translation Performance

1 code implementation 16 Oct 2023 Shaomu Tan, Christof Monz

Our findings highlight that the target side translation quality is the most influential factor, with vocabulary overlap consistently impacting ZS performance.

Machine Translation NMT +1

UvA-MT's Participation in the WMT23 General Translation Shared Task

no code implementations 15 Oct 2023 Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz

This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation.

Machine Translation Translation

Joint Dropout: Improving Generalizability in Low-Resource Neural Machine Translation through Phrase Pair Variables

no code implementations 24 Jul 2023 Ali Araabi, Vlad Niculae, Christof Monz

Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs still remains subpar, partly due to the limited ability to handle previously unseen inputs, i.e., generalization.

Low-Resource Neural Machine Translation NMT +1

Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning

1 code implementation NeurIPS 2023 Baohao Liao, Shaomu Tan, Christof Monz

One effective way to reduce the activation memory is to apply a reversible model, so the intermediate activations are not necessary to be cached and can be recomputed.

Image Classification Question Answering

Parameter-Efficient Fine-Tuning without Introducing New Latency

no code implementations 26 May 2023 Baohao Liao, Yan Meng, Christof Monz

Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while utilizing significantly fewer trainable parameters, and consequently addressing the storage and communication constraints.

Federated Learning

Viewing Knowledge Transfer in Multilingual Machine Translation Through a Representational Lens

no code implementations 19 May 2023 David Stap, Vlad Niculae, Christof Monz

We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation.

Machine Translation Transfer Learning +1

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token

1 code implementation 9 Nov 2022 Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz

We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers.

How Effective is Byte Pair Encoding for Out-Of-Vocabulary Words in Neural Machine Translation?

no code implementations AMTA 2022 Ali Araabi, Christof Monz, Vlad Niculae

While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured.

Machine Translation NMT +1

NLQuAD: A Non-Factoid Long Question Answering Data Set

1 code implementation EACL 2021 Amir Soleimani, Christof Monz, Marcel Worring

We introduce NLQuAD, the first data set with baseline methods for non-factoid long question answering, a task requiring document-level language understanding.

Descriptive Position +2

Optimizing Transformer for Low-Resource Neural Machine Translation

no code implementations COLING 2020 Ali Araabi, Christof Monz

Language pairs with limited amounts of parallel data, also known as low-resource languages, remain a challenge for neural machine translation.

Low-Resource Neural Machine Translation Translation

The Unreasonable Volatility of Neural Machine Translation Models

1 code implementation WS 2020 Marzieh Fadaee, Christof Monz

Recent works have shown that Neural Machine Translation (NMT) models achieve impressive performance, however, questions about understanding the behavior of these models remain unanswered.

Machine Translation NMT +2

Conversations with Search Engines: SERP-based Conversational Response Generation

1 code implementation 29 Apr 2020 Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, Maarten de Rijke

In this paper, we address the problem of answering complex information needs by conversing conversations with search engines, in the sense that users can express their queries in natural language, and directly receive the information they need from a short system response in a conversational manner.

Conversational Response Generation Conversational Search +1

TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation

1 code implementation 26 Mar 2020 Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke

We hypothesize that the deeper reason is that in the training corpora, there are hard tokens that are more difficult for a generative model to learn than others and, once learning has finished, hard tokens are still under-learned, so that repetitive generations are more likely to happen.

Text Generation

Retrospective and Prospective Mixture-of-Generators for Task-oriented Dialogue Response Generation

2 code implementations 19 Nov 2019 Jiahuan Pei, Pengjie Ren, Christof Monz, Maarten de Rijke

We propose a novel mixture-of-generators network (MoGNet) for DRG, where we assume that each token of a response is drawn from a mixture of distributions.

Response Generation Task-Oriented Dialogue Systems

BERT for Evidence Retrieval and Claim Verification

2 code implementations 7 Oct 2019 Amir Soleimani, Christof Monz, Marcel Worring

Motivated by the promising performance of pre-trained language models, we investigate BERT in an evidence retrieval and claim verification pipeline for the FEVER fact extraction and verification challenge.

Claim Verification Retrieval

Thinking Globally, Acting Locally: Distantly Supervised Global-to-Local Knowledge Selection for Background Based Conversation

1 code implementation 26 Aug 2019 Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke

Given a conversational context and background knowledge, we first learn a topic transition vector to encode the most likely text fragments to be used in the next response, which is then used to guide the local KS at each decoding timestamp.

RefNet: A Reference-aware Network for Background Based Conversation

1 code implementation 18 Aug 2019 Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke

In this paper, we propose a Reference-aware Network (RefNet) to address the two issues.

An Intrinsic Nearest Neighbor Analysis of Neural Machine Translation Architectures

no code implementations WS 2019 Hamidreza Ghader, Christof Monz

We compare transformer and recurrent models in a more intrinsic way in terms of capturing lexical semantics and syntactic structures, in contrast to extrinsic approaches used by previous works.

Machine Translation Translation +1

Improving Neural Response Diversity with Frequency-Aware Cross-Entropy Loss

2 code implementations 25 Feb 2019 Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke

Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses.

Response Generation

Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation

no code implementations EMNLP 2018 Marzieh Fadaee, Christof Monz

In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of synthetic data.

Machine Translation Translation

The Importance of Being Recurrent for Modeling Hierarchical Structure

1 code implementation EMNLP 2018 Ke Tran, Arianna Bisazza, Christof Monz

Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016).

Language Modelling Machine Translation +1

Examining the Tip of the Iceberg: A Data Set for Idiom Translation

1 code implementation LREC 2018 Marzieh Fadaee, Arianna Bisazza, Christof Monz

Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs.

Machine Translation NMT +1

Dynamic Data Selection for Neural Machine Translation

1 code implementation EMNLP 2017 Marlies van der Wees, Arianna Bisazza, Christof Monz

Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT).

Machine Translation NMT +1

Learning Topic-Sensitive Word Representations

1 code implementation ACL 2017 Marzieh Fadaee, Arianna Bisazza, Christof Monz

Distributed word representations are widely used for modeling words in NLP tasks.

A Simple but Effective Approach to Improve Arabizi-to-English Statistical Machine Translation

no code implementations WS 2016 Marlies van der Wees, Arianna Bisazza, Christof Monz

A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic.

Translation Transliteration

Ensemble Learning for Multi-Source Neural Machine Translation

no code implementations COLING 2016 Ekaterina Garmash, Christof Monz

We compare two methods of ensemble set induction: sampling parameter initializations for an NMT system, which is a relatively established method in NMT (Sutskever et al., 2014), and NMT systems translating from different source languages into the same target language, i.e., multi-source ensembles, a method recently introduced by Firat et al. (2016).

Ensemble Learning Machine Translation +2

Measuring the Effect of Conversational Aspects on Machine Translation Quality

no code implementations COLING 2016 Marlies van der Wees, Arianna Bisazza, Christof Monz

Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.

Machine Translation Translation

Generating captions without looking beyond objects

no code implementations 12 Oct 2016 Hendrik Heuer, Christof Monz, Arnold W. M. Smeulders

This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparative image caption generation performance by translating from a set of nouns to captions.

Image Captioning Language Modelling +1

Recurrent Memory Networks for Language Modeling

2 code implementations NAACL 2016 Ke Tran, Arianna Bisazza, Christof Monz

In this paper, we propose Recurrent Memory Network (RMN), a novel RNN architecture, that not only amplifies the power of RNN but also facilitates our understanding of its internal functioning and allows us to discover underlying patterns in data.

Language Modelling Sentence +1
