no code implementations • EMNLP (IWSLT) 2019 • Joanna Wetesko, Marcin Chochowski, Pawel Przybysz, Philip Williams, Roman Grundkiewicz, Rico Sennrich, Barry Haddow, None Barone, Valerio Miceli, Alexandra Birch
This paper describes the joint submission to the IWSLT 2019 English to Czech task by Samsung RD Institute, Poland, and the University of Edinburgh.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • WMT (EMNLP) 2020 • Ulrich Germann, Roman Grundkiewicz, Martin Popel, Radina Dobreva, Nikolay Bogoychev, Kenneth Heafield
We describe the joint submission of the University of Edinburgh and Charles University, Prague, to the Czech/English track in the WMT 2020 Shared Task on News Translation.
1 code implementation • WMT (EMNLP) 2021 • Maximiliana Behnke, Nikolay Bogoychev, Alham Fikri Aji, Kenneth Heafield, Graeme Nail, Qianqian Zhu, Svetlana Tchistiakova, Jelmer Van der Linde, Pinzhen Chen, Sidharth Kashyap, Roman Grundkiewicz
We participated in all tracks of the WMT 2021 efficient machine translation task: single-core CPU, multi-core CPU, and GPU hardware with throughput and latency conditions.
no code implementations • WMT (EMNLP) 2021 • Kenneth Heafield, Qianqian Zhu, Roman Grundkiewicz
The machine translation efficiency task challenges participants to make their systems faster and smaller with minimal impact on translation quality.
no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri
This paper presents the results of the newstranslation task, the multilingual low-resourcetranslation for Indo-European languages, thetriangular translation task, and the automaticpost-editing task organised as part of the Con-ference on Machine Translation (WMT) 2021. In the news task, participants were asked tobuild machine translation systems for any of10 language pairs, to be evaluated on test setsconsisting mainly of news stories.
no code implementations • 7 Oct 2024 • Vikas Raunak, Roman Grundkiewicz, Marcin Junczys-Dowmunt
In this work, we introduce instruction finetuning for Neural Machine Translation (NMT) models, which distills instruction following capabilities from Large Language Models (LLMs) into orders-of-magnitude smaller NMT models.
no code implementations • 15 Aug 2024 • Thamme Gowda, Roman Grundkiewicz, Elijah Rippeth, Matt Post, Marcin Junczys-Dowmunt
We describe a Python interface to Marian NMT, a C++-based training and inference toolkit for sequence-to-sequence models, focusing on machine translation.
1 code implementation • 29 Jul 2024 • Tom Kocmi, Eleftherios Avramidis, Rachel Bawden, Ondrej Bojar, Anton Dvorkovich, Christian Federmann, Mark Fishel, Markus Freitag, Thamme Gowda, Roman Grundkiewicz, Barry Haddow, Marzena Karpinska, Philipp Koehn, Benjamin Marie, Kenton Murray, Masaaki Nagata, Martin Popel, Maja Popovic, Mariya Shmatova, Steinþór Steingrímsson, Vilém Zouhar
This is the preliminary ranking of WMT24 General MT systems based on automatic metrics.
1 code implementation • 17 Jun 2024 • Tom Kocmi, Vilém Zouhar, Eleftherios Avramidis, Roman Grundkiewicz, Marzena Karpinska, Maja Popović, Mrinmaya Sachan, Mariya Shmatova
High-quality Machine Translation (MT) evaluation relies heavily on human judgments.
1 code implementation • 14 Aug 2023 • Matt Post, Thamme Gowda, Roman Grundkiewicz, Huda Khayrallah, Rohit Jain, Marcin Junczys-Dowmunt
Many machine translation toolkits make use of a data preparation step wherein raw data is transformed into a tensor format that can be used directly by the trainer.
2 code implementations • WMT (EMNLP) 2021 • Tom Kocmi, Christian Federmann, Roman Grundkiewicz, Marcin Junczys-Dowmunt, Hitokazu Matsushita, Arul Menezes
Automatic metrics are commonly used as the exclusive tool for declaring the superiority of one machine translation system's quality over another.
no code implementations • EACL (HumEval) 2021 • Roman Grundkiewicz, Marcin Junczys-Dowmunt, Christian Federmann, Tom Kocmi
Recent studies emphasize the need of document context in human evaluation of machine translations, but little research has been done on the impact of user interfaces on annotator productivity and the reliability of assessments.
no code implementations • COLING 2020 • Roman Grundkiewicz, Christopher Bryant, Mariano Felice
Grammatical Error Correction (GEC) is the task of automatically detecting and correcting all types of errors in written text.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • WS 2020 • Nikolay Bogoychev, Roman Grundkiewicz, Alham Fikri Aji, Maximiliana Behnke, Kenneth Heafield, Sidharth Kashyap, Emmanouil-Ioannis Farsarakis, Mateusz Chudyk
We participated in all tracks of the Workshop on Neural Generation and Translation 2020 Efficiency Shared Task: single-core CPU, multi-core CPU, and GPU.
no code implementations • WS 2019 • Roman Grundkiewicz, Marcin Junczys-Dowmunt
There has been an increased interest in low-resource approaches to automatic grammatical error correction.
no code implementations • WS 2019 • Young Jin Kim, Marcin Junczys-Dowmunt, Hany Hassan, Alham Fikri Aji, Kenneth Heafield, Roman Grundkiewicz, Nikolay Bogoychev
Taking our dominating submissions to the previous edition of the shared task as a starting point, we develop improved teacher-student training via multi-agent dual-learning and noisy backward-forward translation for Transformer-based student models.
1 code implementation • WS 2019 • Roman Grundkiewicz, Marcin Junczys-Dowmunt, Kenneth Heafield
Considerable effort has been made to address the data sparsity problem in neural grammatical error correction.
Ranked #16 on Grammatical Error Correction on BEA-2019 (test)
no code implementations • WS 2019 • Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch
For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data.
no code implementations • WS 2018 • Barry Haddow, Nikolay Bogoychev, Denis Emelin, Ulrich Germann, Roman Grundkiewicz, Kenneth Heafield, Antonio Valerio Miceli Barone, Rico Sennrich
The University of Edinburgh made submissions to all 14 language pairs in the news translation task, with strong performances in most pairs.
no code implementations • WS 2018 • Marcin Junczys-Dowmunt, Roman Grundkiewicz
This paper describes the Microsoft and University of Edinburgh submission to the Automatic Post-editing shared task at WMT2018.
1 code implementation • WS 2018 • Roman Grundkiewicz, Kenneth Heafield
Transliterating named entities from one language into another can be approached as neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models.
no code implementations • WS 2018 • Marcin Junczys-Dowmunt, Kenneth Heafield, Hieu Hoang, Roman Grundkiewicz, Anthony Aue
This paper describes the submissions of the "Marian" team to the WNMT 2018 shared task.
1 code implementation • NAACL 2018 • Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield
Previously, neural methods in grammatical error correction (GEC) did not reach state-of-the-art results compared to phrase-based statistical machine translation (SMT) baselines.
Ranked #1 on Grammatical Error Correction on _Restricted_
no code implementations • NAACL 2018 • Roman Grundkiewicz, Marcin Junczys-Dowmunt
We combine two of the most popular approaches to automated Grammatical Error Correction (GEC): GEC based on Statistical Machine Translation (SMT) and GEC based on Neural Machine Translation (NMT).
3 code implementations • ACL 2018 • Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.
no code implementations • IJCNLP 2017 • Marcin Junczys-Dowmunt, Roman Grundkiewicz
In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output.
no code implementations • TACL 2017 • Andr{\'e} F. T. Martins, Marcin Junczys-Dowmunt, Fabio N. Kepler, Ram{\'o}n Astudillo, Chris Hokamp, Roman Grundkiewicz
Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways.
no code implementations • EMNLP 2016 • Marcin Junczys-Dowmunt, Roman Grundkiewicz
In this work, we study parameter tuning towards the M^2 metric, the standard metric for automatic grammar error correction (GEC) tasks.
no code implementations • WS 2016 • Marcin Junczys-Dowmunt, Roman Grundkiewicz
This paper describes the submission of the AMU (Adam Mickiewicz University) team to the Automatic Post-Editing (APE) task of WMT 2016.