no code implementations • AMTA 2016 • Hamidreza Ghader, Christof Monz
Lexicalized and hierarchical reordering models use relative frequencies of fully lexicalized phrase pairs to learn phrase reordering distributions.
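The relative-frequency estimation mentioned above can be sketched as follows. This is a minimal illustration with invented phrase pairs and counts, not code or data from the paper:

```python
# Hypothetical sketch: estimating a lexicalized reordering distribution
# by relative frequency of orientations per phrase pair, as in
# phrase-based MT. All events below are illustrative toy data.
from collections import Counter, defaultdict

# (source phrase, target phrase, orientation) extraction events
events = [
    ("das Haus", "the house", "monotone"),
    ("das Haus", "the house", "monotone"),
    ("das Haus", "the house", "swap"),
    ("nicht", "not", "swap"),
]

counts = defaultdict(Counter)
for src, tgt, orient in events:
    counts[(src, tgt)][orient] += 1

def reordering_prob(src, tgt, orient):
    """P(orientation | phrase pair) as a relative frequency."""
    c = counts[(src, tgt)]
    total = sum(c.values())
    return c[orient] / total if total else 0.0

# P(monotone | "das Haus" -> "the house") = 2/3
```

Because the estimate is conditioned on fully lexicalized phrase pairs, rare pairs get very sparse counts, which is the sparsity problem such models face.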
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • 19 Feb 2024 • Baohao Liao, Christof Monz
With the growing size of large language models, the role of quantization becomes increasingly significant.
no code implementations • 7 Feb 2024 • Baohao Liao, Christof Monz
Memory-efficient finetuning of large language models (LLMs) has recently attracted considerable attention with the increasing size of LLMs, primarily due to GPU memory constraints and the fact that these methods achieve results comparable to full finetuning.
no code implementations • 3 Feb 2024 • Sara Rajaee, Christof Monz
Recent advances in training multilingual language models on large datasets have shown promising results in knowledge transfer across languages, achieving high performance on downstream tasks.
no code implementations • 1 Feb 2024 • Yan Meng, Christof Monz
In this paper, we conduct a large-scale study that varies the auxiliary target side languages along two dimensions, i.e., linguistic similarity and corpus size, to show the dynamic impact of knowledge transfer on the main language pairs.
1 code implementation • 22 Jan 2024 • Di Wu, Shaomu Tan, Yan Meng, David Stap, Christof Monz
Zero-shot translation aims to translate between language pairs not seen during training in Multilingual Machine Translation (MMT) and is largely considered an open problem.
1 code implementation • 23 Oct 2023 • David Stap, Christof Monz
k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples.
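The datastore idea can be illustrated with a toy retrieval step. This is a minimal sketch assuming 2-D hidden states, Euclidean distance, a softmax over negative distances, and an interpolation weight `lam`; none of these specifics come from the paper:

```python
# Minimal kNN-MT-style sketch: cache (hidden state, target token) pairs,
# retrieve nearest neighbors at decoding time, and interpolate the
# resulting distribution with the model's own. Toy data throughout.
import math

datastore = [  # (hidden state, target token) pairs cached from training
    ((0.0, 1.0), "cat"),
    ((0.1, 0.9), "cat"),
    ((1.0, 0.0), "dog"),
]

def knn_probs(query, k=2):
    """Turn the k nearest cached states into a distribution over tokens."""
    nearest = sorted((math.dist(query, h), tok) for h, tok in datastore)[:k]
    weights = [math.exp(-d) for d, _ in nearest]  # softmax over -distance
    z = sum(weights)
    probs = {}
    for (_, tok), w in zip(nearest, weights):
        probs[tok] = probs.get(tok, 0.0) + w / z
    return probs

def interpolate(p_model, p_knn, lam=0.5):
    """Mix the base model distribution with the retrieved one."""
    tokens = set(p_model) | set(p_knn)
    return {t: (1 - lam) * p_model.get(t, 0.0) + lam * p_knn.get(t, 0.0)
            for t in tokens}
```

A query close to the cached "cat" states yields a distribution dominated by "cat", which is then blended with the NMT model's own prediction.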
no code implementations • 20 Oct 2023 • Quinten Bolding, Baohao Liao, Brandon James Denis, Jun Luo, Christof Monz
Lastly, experiments on C-MTNT showcased its effectiveness in evaluating the robustness of NMT models, highlighting the potential of advanced language models for data cleaning and emphasizing C-MTNT as a valuable resource.
1 code implementation • 16 Oct 2023 • Shaomu Tan, Christof Monz
Our findings highlight that the target side translation quality is the most influential factor, with vocabulary overlap consistently impacting ZS performance.
no code implementations • 15 Oct 2023 • Di Wu, Shaomu Tan, David Stap, Ali Araabi, Christof Monz
This paper describes the UvA-MT's submission to the WMT 2023 shared task on general machine translation.
no code implementations • 24 Jul 2023 • Ali Araabi, Vlad Niculae, Christof Monz
Despite the tremendous success of Neural Machine Translation (NMT), its performance on low-resource language pairs remains subpar, partly due to a limited ability to handle previously unseen inputs, i.e., generalization.
1 code implementation • NeurIPS 2023 • Baohao Liao, Shaomu Tan, Christof Monz
One effective way to reduce activation memory is to use a reversible model, so that intermediate activations need not be cached and can instead be recomputed.
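The recomputation trick rests on the fact that a reversible block can be inverted exactly. A minimal scalar sketch, where `F` and `G` are placeholder functions standing in for the sublayers (the actual architecture in the paper differs):

```python
# Sketch of a reversible residual block: the inputs can be exactly
# reconstructed from the outputs, so activations need not be stored
# during the forward pass. F and G are toy stand-ins for sublayers.
def F(x):
    return 0.5 * x + 1.0

def G(x):
    return 0.25 * x

def forward(x1, x2):
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def reconstruct(y1, y2):
    # Invert the block: recompute the inputs instead of caching them
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2
```

During backpropagation, each block's inputs are recovered from its outputs, trading a little extra compute for a large saving in activation memory.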
no code implementations • 26 May 2023 • Baohao Liao, Yan Meng, Christof Monz
Parameter-efficient fine-tuning (PEFT) of pre-trained language models has recently demonstrated remarkable achievements, effectively matching the performance of full fine-tuning while utilizing significantly fewer trainable parameters, and consequently addressing the storage and communication constraints.
1 code implementation • 23 May 2023 • Di Wu, Christof Monz
Using a vocabulary that is shared across languages is common practice in Multilingual Neural Machine Translation (MNMT).
no code implementations • 19 May 2023 • David Stap, Vlad Niculae, Christof Monz
We argue that translation quality alone is not a sufficient metric for measuring knowledge transfer in multilingual neural machine translation.
1 code implementation • 9 Nov 2022 • Baohao Liao, David Thulke, Sanjika Hewavitharana, Hermann Ney, Christof Monz
We show: (1) [MASK]s can indeed be appended at a later layer, being disentangled from the word embedding; (2) The gathering of contextualized information from unmasked tokens can be conducted with a few layers.
no code implementations • AMTA 2022 • Ali Araabi, Christof Monz, Vlad Niculae
While it is often assumed that by using BPE, NMT systems are capable of handling OOV words, the effectiveness of BPE in translating OOV words has not been explicitly measured.
1 code implementation • EACL 2021 • Amir Soleimani, Christof Monz, Marcel Worring
We introduce NLQuAD, the first data set with baseline methods for non-factoid long question answering, a task requiring document-level language understanding.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • COLING 2020 • Ali Araabi, Christof Monz
Language pairs with limited amounts of parallel data, also known as low-resource languages, remain a challenge for neural machine translation.
1 code implementation • WS 2020 • Marzieh Fadaee, Christof Monz
Recent works have shown that Neural Machine Translation (NMT) models achieve impressive performance; however, questions about understanding the behavior of these models remain unanswered.
1 code implementation • 29 Apr 2020 • Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, Maarten de Rijke
In this paper, we address the problem of answering complex information needs by conversing with search engines, in the sense that users can express their queries in natural language and directly receive the information they need from a short system response in a conversational manner.
1 code implementation • 26 Mar 2020 • Shaojie Jiang, Thomas Wolf, Christof Monz, Maarten de Rijke
We hypothesize that the deeper reason is that the training corpora contain hard tokens that are more difficult for a generative model to learn than others; once training has finished, these hard tokens remain under-learned, so that repetitive generations are more likely to happen.
2 code implementations • 19 Nov 2019 • Jiahuan Pei, Pengjie Ren, Christof Monz, Maarten de Rijke
We propose a novel mixture-of-generators network (MoGNet) for DRG, where we assume that each token of a response is drawn from a mixture of distributions.
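The mixture assumption amounts to a weighted sum of expert distributions per token, p(token) = Σ_k w_k · p_k(token). A toy sketch with invented experts and weights (not the paper's learned components):

```python
# Sketch of a mixture-of-generators token distribution: each expert
# proposes a distribution over the vocabulary, and the mixture combines
# them with gating weights. All numbers below are illustrative.
experts = [
    {"yes": 0.7, "no": 0.3},  # expert specialized on one response style
    {"yes": 0.2, "no": 0.8},  # expert specialized on another
]
weights = [0.6, 0.4]          # gating weights, summing to 1

def mixture(experts, weights):
    out = {}
    for w, p in zip(weights, experts):
        for tok, prob in p.items():
            out[tok] = out.get(tok, 0.0) + w * prob
    return out

p = mixture(experts, weights)
# p["yes"] = 0.6 * 0.7 + 0.4 * 0.2 = 0.50
```

In MoGNet the gating weights and expert distributions are produced by the network; here they are fixed constants purely to show the combination rule.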
2 code implementations • 7 Oct 2019 • Amir Soleimani, Christof Monz, Marcel Worring
Motivated by the promising performance of pre-trained language models, we investigate BERT in an evidence retrieval and claim verification pipeline for the FEVER fact extraction and verification challenge.
1 code implementation • 26 Aug 2019 • Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
Given a conversational context and background knowledge, we first learn a topic transition vector to encode the most likely text fragments to be used in the next response, which is then used to guide the local KS at each decoding time step.
1 code implementation • 18 Aug 2019 • Chuan Meng, Pengjie Ren, Zhumin Chen, Christof Monz, Jun Ma, Maarten de Rijke
In this paper, we propose a Reference-aware Network (RefNet) to address the two issues.
no code implementations • WS 2019 • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, Marcos Zampieri
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
no code implementations • WS 2019 • Hamidreza Ghader, Christof Monz
We compare transformer and recurrent models in a more intrinsic way in terms of capturing lexical semantics and syntactic structures, in contrast to extrinsic approaches used by previous works.
2 code implementations • 25 Feb 2019 • Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten de Rijke
Specifically, we first analyze the influence of the commonly used Cross-Entropy (CE) loss function, and find that the CE loss function prefers high-frequency tokens, which results in low-diversity responses.
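One common remedy for the high-frequency bias described above is to reweight the cross-entropy loss by token frequency. The weighting scheme below is purely illustrative and not the paper's exact formulation:

```python
# Sketch of frequency-weighted cross-entropy: down-weight the loss on
# high-frequency tokens so training is not dominated by generic words,
# which would otherwise push the model toward low-diversity responses.
# Frequencies and the weighting scheme are illustrative assumptions.
import math

token_freq = {"the": 1000, "interesting": 10}
total = sum(token_freq.values())

def weighted_ce(token, prob):
    """Negative log-likelihood scaled down for frequent tokens."""
    weight = 1.0 - token_freq[token] / total  # rarer token -> larger weight
    return -weight * math.log(prob)
```

For the same model probability, the rare token now contributes far more loss than the frequent one, counteracting the CE preference for high-frequency tokens.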
no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
no code implementations • WS 2018 • Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, Christof Monz
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018.
no code implementations • EMNLP 2018 • Marzieh Fadaee, Christof Monz
In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of synthetic data.
1 code implementation • EMNLP 2018 • Ke Tran, Arianna Bisazza, Christof Monz
Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016).
1 code implementation • LREC 2018 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs.
no code implementations • IJCNLP 2017 • Hamidreza Ghader, Christof Monz
Thus, the question remains how attention is similar to or different from traditional alignment.
no code implementations • WS 2017 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shu-Jian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, Marco Turchi
1 code implementation • EMNLP 2017 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT).
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora.
Data Augmentation • Low-Resource Neural Machine Translation • +2
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Distributed word representations are widely used for modeling words in NLP tasks.
no code implementations • WS 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic.
no code implementations • COLING 2016 • Ekaterina Garmash, Christof Monz
We compare two methods of ensemble set induction: sampling parameter initializations for an NMT system, which is a relatively established method in NMT (Sutskever et al., 2014), and NMT systems translating from different source languages into the same target language, i.e., multi-source ensembles, a method recently introduced by Firat et al. (2016).
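Whichever way the ensemble set is induced, the combination step is the same: merge the member systems' per-token distributions at each decoding step. A toy sketch using a plain average over probabilities (real NMT ensembles typically average log-probabilities; the distributions below are invented):

```python
# Sketch of ensemble combination at one decoding step: average the
# per-token distributions proposed by several member systems.
# Toy two-token vocabulary; member distributions are illustrative.
def ensemble(dists):
    """Average a list of token->probability dicts."""
    tokens = set().union(*dists)
    n = len(dists)
    return {t: sum(d.get(t, 0.0) for d in dists) / n for t in tokens}

step = ensemble([
    {"house": 0.6, "home": 0.4},  # e.g. a system from one initialization
    {"house": 0.8, "home": 0.2},  # e.g. a system from another source language
])
# step["house"] = (0.6 + 0.8) / 2 = 0.7
```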
no code implementations • COLING 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.
no code implementations • 12 Oct 2016 • Hendrik Heuer, Christof Monz, Arnold W. M. Smeulders
This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable image caption generation performance by translating from a set of nouns to captions.
no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi
no code implementations • WS 2016 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri
2 code implementations • NAACL 2016 • Ke Tran, Arianna Bisazza, Christof Monz
In this paper, we propose Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning and allows us to discover underlying patterns in data.
no code implementations • WS 2015 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi
no code implementations • WS 2014 • Ondřej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, Aleš Tamchyna