Search Results for author: Martin Popel

Found 42 papers, 5 papers with code

CorefUD 1.0: Coreference Meets Universal Dependencies

no code implementations • LREC 2022 • Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, Amir Zeldes, Daniel Zeman

Recent advances in standardization for annotated language resources have led to successful large scale efforts, such as the Universal Dependencies (UD) project for multilingual syntactically annotated data.

coreference-resolution • named-entity-recognition • +2

Do UD Trees Match Mention Spans in Coreference Annotations?

no code implementations • Findings (EMNLP) 2021 • Martin Popel, Zdeněk Žabokrtský, Anna Nedoluzhko, Michal Novák, Daniel Zeman

One can find dozens of data resources for various languages in which coreference - a relation between two or more expressions that refer to the same real-world entity - is manually annotated.

Domain Adaptation of Document-Level NMT in IWSLT19

no code implementations • EMNLP (IWSLT) 2019 • Martin Popel, Christian Federmann

We describe our four NMT systems submitted to the IWSLT19 shared task in English→Czech text-to-text translation of TED talks.

Domain Adaptation • NMT • +2

CUNI English-Czech and English-Polish Systems in WMT20: Robust Document-Level Training

no code implementations • WMT (EMNLP) 2020 • Martin Popel

We describe our two NMT systems submitted to the WMT 2020 shared task in English↔Czech and English↔Polish news translation.

NMT • Sentence • +1

Detecting Post-Edited References and Their Effect on Human Evaluation

no code implementations • EACL (HumEval) 2021 • Věra Kloudová, Ondřej Bojar, Martin Popel

This paper provides a quick overview of possible methods for detecting that reference translations were actually created by post-editing an MT system.

Evaluating Optimal Reference Translations

1 code implementation • 28 Nov 2023 • Vilém Zouhar, Věra Kloudová, Martin Popel, Ondřej Bojar

The overall translation quality reached by current machine translation (MT) systems for high-resourced language pairs is remarkably good.

Machine Translation • Translation

CUNI Systems for the WMT22 Czech-Ukrainian Translation Task

no code implementations • 1 Dec 2022 • Martin Popel, Jindřich Libovický, Jindřich Helcl

We present Charles University submissions to the WMT22 General Translation Shared Task on Czech-Ukrainian and Ukrainian-Czech machine translation.

Machine Translation • Translation

CUNI Submission in WMT22 General Task

no code implementations • 29 Nov 2022 • Josef Jon, Martin Popel, Ondřej Bojar

We evaluate performance of MBR decoding compared to traditional mixed backtranslation training and we show a possible synergy when using both of the techniques simultaneously.

Translation
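
A brief aside on the MBR (Minimum Bayes Risk) decoding mentioned in the abstract above: the idea is to sample several candidate translations and return the one with the highest expected utility against the other samples. The sketch below is a minimal illustration only; the toy unigram-F1 utility and the hand-written candidate list are placeholders, not the metric or sampling setup used in the paper.

    # Minimal sketch of MBR decoding over a set of sampled candidate translations.
    # The utility function (unigram F1) is a toy stand-in for a real MT metric.

    def unigram_f1(hyp: str, ref: str) -> float:
        """Toy utility: unigram F1 overlap between two whitespace-tokenized strings."""
        h, r = hyp.split(), ref.split()
        if not h or not r:
            return 0.0
        overlap = sum(min(h.count(t), r.count(t)) for t in set(h))
        if overlap == 0:
            return 0.0
        precision, recall = overlap / len(h), overlap / len(r)
        return 2 * precision * recall / (precision + recall)

    def mbr_decode(candidates: list[str]) -> str:
        """Return the candidate with the highest average utility against all samples.

        Each candidate's self-similarity adds the same constant to every score,
        so including it does not change the ranking.
        """
        def expected_utility(cand: str) -> float:
            return sum(unigram_f1(cand, other) for other in candidates) / len(candidates)
        return max(candidates, key=expected_utility)

    samples = [
        "the cat sat on the mat",
        "a cat sat on the mat",
        "the cat is sitting on a mat",
    ]
    print(mbr_decode(samples))

In practice the candidates come from sampling the NMT model itself, and the utility is a stronger surface-level or neural MT metric rather than unigram overlap.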

Understanding Model Robustness to User-generated Noisy Texts

1 code implementation • WNUT (ACL) 2021 • Jakub Náplava, Martin Popel, Milan Straka, Jana Straková

We also compare two approaches to address the performance drop: a) training the NLP models with noised data generated by our framework; and b) reducing the input noise with an external system for natural language correction.

Grammatical Error Correction • Machine Translation • +5
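
To make approach (a) above concrete, training on synthetically noised inputs, here is a deliberately simplified character-level noiser. It is not the paper's noise-generation framework; the noise types (diacritic stripping, adjacent-character swaps) and the rates are hypothetical placeholders chosen for illustration.

    # Illustrative noiser for generating user-style noisy text from clean text.
    # NOT the paper's framework; noise types and probabilities are placeholders.
    import random
    import unicodedata

    def strip_diacritics(text: str) -> str:
        """Drop combining marks, a common feature of informal Czech typing."""
        decomposed = unicodedata.normalize("NFD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    def noise(text: str, p_swap: float = 0.05, p_strip: float = 0.5, seed: int = 0) -> str:
        """Randomly strip diacritics and swap adjacent letters to mimic typos."""
        rng = random.Random(seed)
        if rng.random() < p_strip:
            text = strip_diacritics(text)
        chars = list(text)
        i = 0
        while i < len(chars) - 1:
            if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < p_swap:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                i += 2
            else:
                i += 1
        return "".join(chars)

    print(noise("Příliš žluťoučký kůň úpěl ďábelské ódy."))

Noised inputs produced this way would be paired with the original clean outputs during training, which is the spirit of approach (a).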

Announcing CzEng 2.0 Parallel Corpus with over 2 Gigawords

no code implementations • 6 Jul 2020 • Tom Kocmi, Martin Popel, Ondřej Bojar

We present a new release of the Czech-English parallel corpus CzEng 2.0 consisting of over 2 billion words (2 "gigawords") in each language.

CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

no code implementations • CoNLL 2018 • Daniel Zeman, Jan Hajič, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, Slav Petrov

Every year, the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets.

Dependency Parsing • Morphological Analysis • +1

Training Tips for the Transformer Model

4 code implementations • 1 Apr 2018 • Martin Popel, Ondřej Bojar

This article describes our experiments in neural machine translation using the recent Tensor2Tensor framework and the Transformer sequence-to-sequence model (Vaswani et al., 2017).

Machine Translation • Sentence • +1
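
Much of the paper's advice concerns batch size, learning rate, and warmup steps when training the Transformer in Tensor2Tensor. For orientation, the sketch below implements the standard inverse-square-root ("Noam") warmup schedule associated with the Transformer; the d_model and warmup_steps values are illustrative defaults, not the paper's recommended settings.

    # Inverse-square-root learning-rate schedule with linear warmup, as used in
    # the original Transformer recipe. Constants here are illustrative only.

    def noam_lr(step: int, d_model: int = 512, warmup_steps: int = 16000) -> float:
        """Learning rate at a given step: linear warmup, then step**-0.5 decay."""
        step = max(step, 1)
        return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

    for s in (1000, 8000, 16000, 100000):
        print(s, round(noam_lr(s), 6))

Larger effective batch sizes generally tolerate longer warmup and higher peak learning rates, which is among the training dynamics the paper examines empirically.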

QTLeap WSD/NED Corpora: Semantic Annotation of Parallel Corpora in Six Languages

no code implementations • LREC 2016 • Arantxa Otegi, Nora Aranberri, Antonio Branco, Jan Hajič, Martin Popel, Kiril Simov, Eneko Agirre, Petya Osenova, Rita Pereira, João Silva, Steven Neale

This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference.

Cross-Lingual Transfer • Entity Disambiguation • +9
