no code implementations • IWSLT 2017 • Mauro Cettolo, Marcello Federico, Luisa Bentivogli, Jan Niehues, Sebastian Stüker, Katsuhito Sudoh, Koichiro Yoshino, Christian Federmann
The IWSLT 2017 evaluation campaign has organised three tasks.
no code implementations • IWSLT 2016 • Mauro Cettolo, Jan Niehues, Sebastian Stüker, Luisa Bentivogli, Rolando Cattoni, Marcello Federico
The IWSLT 2016 Evaluation Campaign featured two tasks: the translation of talks and the translation of video conference conversations.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+3
1 code implementation • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi
Subtitling tools are recently being adapted for post-editing by providing automatically generated subtitles, and featuring not only machine translation, but also automatic segmentation and synchronisation.
no code implementations • IWSLT (EMNLP) 2018 • Jan Niehues, Rolando Cattoni, Sebastian Stüker, Mauro Cettolo, Marco Turchi, Marcello Federico
The International Workshop of Spoken Language Translation (IWSLT) 2018 Evaluation Campaign featured two tasks: low-resource machine translation and speech translation.
no code implementations • EAMT 2022 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Matteo Negri, Marco Turchi
This project aimed at extending the test sets of the MuST-C speech translation (ST) corpus with new reference translations.
no code implementations • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi
In response to the increasing interest towards automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles.
no code implementations • IWSLT (EMNLP) 2018 • Luisa Bentivogli, Mauro Cettolo, Marcello Federico, Christian Federmann
In this paper we present an analysis of the two most prominent methodologies used for the human evaluation of MT quality, namely evaluation based on Post-Editing (PE) and evaluation based on Direct Assessment (DA).
no code implementations • 7 Nov 2024 • Ibrahim Said Ahmad, Antonios Anastasopoulos, Ondřej Bojar, Claudia Borg, Marine Carpuat, Roldano Cattoni, Mauro Cettolo, William Chen, Qianqian Dong, Marcello Federico, Barry Haddow, Dávid Javorský, Mateusz Krubiński, Tsz Kin Lam, Xutai Ma, Prashant Mathur, Evgeny Matusov, Chandresh Maurya, John McCrae, Kenton Murray, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, Atul Kr. Ojha, John Ortega, Sara Papi, Peter Polák, Adam Pospíšil, Pavel Pecina, Elizabeth Salesky, Nivedita Sethiya, Balaram Sarkar, Jiatong Shi, Claytone Sikasote, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Brian Thompson, Marco Turchi, Alex Waibel, Shinji Watanabe, Patrick Wilken, Petr Zemánek, Rodolfo Zevallos
This paper reports on the shared tasks organized by the 21st IWSLT Conference.
1 code implementation • 3 Nov 2024 • Dennis Fucci, Marco Gaido, Beatrice Savoldi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli
Spurred by the demand for interpretable models, research on eXplainable AI for language technologies has experienced significant growth, with feature attribution methods emerging as a cornerstone of this progress.
1 code implementation • 1 Oct 2024 • Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, Matteo Negri
The rise of foundation models (FMs), coupled with regulatory efforts addressing their risks and impacts, has sparked significant interest in open-source models.
2 code implementations • 17 May 2024 • Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli
Subtitling plays a crucial role in enhancing the accessibility of audiovisual content and encompasses three primary subtasks: translating spoken dialogue, segmenting translations into concise textual units, and estimating timestamps that govern their on-screen duration.
1 code implementation • 24 Oct 2023 • Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli
When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits.
1 code implementation • 10 Oct 2023 • Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli
Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+2
1 code implementation • 27 Sep 2022 • Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi
Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i. e. subtitles and their corresponding timestamps.
1 code implementation • LREC 2022 • Alina Karakanta, François Buet, Mauro Cettolo, François Yvon
Subtitle segmentation can be evaluated with sequence segmentation metrics against a human reference.
no code implementations • ACL 2021 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi
Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.
no code implementations • ICNLSP 2021 • Marco Gaido, Matteo Negri, Mauro Cettolo, Marco Turchi
The audio segmentation mismatch between training data and those seen at run-time is a major problem in direct speech translation.
1 code implementation • EACL 2021 • Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi
Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST).
1 code implementation • 5 Aug 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi
We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4. 25 BLEU points.
no code implementations • WS 2016 • Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction.
no code implementations • COLING 2018 • Surafel M. Lakew, Mauro Cettolo, Marcello Federico
Motivated by this, our work (i) provides a quantitative and comparative analysis of the translations produced by bilingual, multilingual and zero-shot systems; (ii) investigates the translation quality of two of the currently dominant neural architectures in MT, which are the Recurrent and the Transformer ones; and (iii) quantitatively explores how the closeness between languages influences the zero-shot translation.
no code implementations • WS 2017 • Sharid Lo{\'a}iciga, Sara Stymne, Preslav Nakov, Christian Hardmeier, J{\"o}rg Tiedemann, Mauro Cettolo, Yannick Versley
We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual pronoun prediction.
no code implementations • 14 Dec 2016 • Mauro Cettolo, Mara Chinea Rios, Roldano Cattoni
In this paper, we report on domain clustering in the ambit of an adaptive MT architecture.
no code implementations • 3 Oct 2016 • Mauro Cettolo
We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT3, the Web inventory that repurposes the original content of the TED website in a way which is more convenient for MT researchers.
no code implementations • EMNLP 2016 • Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico
Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT).
no code implementations • LREC 2016 • Luisa Bentivogli, Mauro Cettolo, M. Amin Farajian, Marcello Federico
This paper presents WAGS (Word Alignment Gold Standard), a novel benchmark which allows extensive evaluation of WA tools on out-of-vocabulary (OOV) and rare words.
no code implementations • COLING 2014 • Marcello Federico, Nicola Bertoldi, Mauro Cettolo, Matteo Negri, Marco Turchi, Marco Trombetti, Aless Cattelan, ro, Antonio Farina, Domenico Lupinetti, Andrea Martines, Alberto Massidda, Holger Schwenk, Lo{\"\i}c Barrault, Frederic Blain, Philipp Koehn, Christian Buck, Ulrich Germann
no code implementations • LREC 2012 • Marcello Federico, Sebastian St{\"u}ker, Luisa Bentivogli, Michael Paul, Mauro Cettolo, Teresa Herrmann, Jan Niehues, Giovanni Moretti
We report here on the eighth evaluation campaign organized in 2011 by the IWSLT workshop series.