Search Results for author: Mauro Cettolo

Found 27 papers, 7 papers with code

Overview of the IWSLT 2017 Evaluation Campaign

no code implementations • IWSLT 2017 • Mauro Cettolo, Marcello Federico, Luisa Bentivogli, Jan Niehues, Sebastian Stüker, Katsuhito Sudoh, Koichiro Yoshino, Christian Federmann

The IWSLT 2017 evaluation campaign has organised three tasks.

Machine Translation, Translation

Extending the MuST-C Corpus for a Comparative Evaluation of Speech Translation Technology

no code implementations • EAMT 2022 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Matteo Negri, Marco Turchi

This project aimed at extending the test sets of the MuST-C speech translation (ST) corpus with new reference translations.

Machine Translation, Translation

Towards a methodology for evaluating automatic subtitling

no code implementations • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

In response to the increasing interest towards automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles.

Segmentation

Post-editing in Automatic Subtitling: A Subtitlers’ perspective

1 code implementation • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

Subtitling tools are recently being adapted for post-editing by providing automatically generated subtitles, and featuring not only machine translation, but also automatic segmentation and synchronisation.

Machine Translation, Translation

Machine Translation Human Evaluation: an investigation of evaluation based on Post-Editing and its relation with Direct Assessment

no code implementations • IWSLT (EMNLP) 2018 • Luisa Bentivogli, Mauro Cettolo, Marcello Federico, Christian Federmann

In this paper we present an analysis of the two most prominent methodologies used for the human evaluation of MT quality, namely evaluation based on Post-Editing (PE) and evaluation based on Direct Assessment (DA).

Machine Translation

The IWSLT 2018 Evaluation Campaign

no code implementations • IWSLT (EMNLP) 2018 • Jan Niehues, Roldano Cattoni, Sebastian Stüker, Mauro Cettolo, Marco Turchi, Marcello Federico

The International Workshop on Spoken Language Translation (IWSLT) 2018 Evaluation Campaign featured two tasks: low-resource machine translation and speech translation.

Machine Translation, Translation

The IWSLT 2016 Evaluation Campaign

no code implementations • IWSLT 2016 • Mauro Cettolo, Jan Niehues, Sebastian Stüker, Luisa Bentivogli, Roldano Cattoni, Marcello Federico

The IWSLT 2016 Evaluation Campaign featured two tasks: the translation of talks and the translation of video conference conversations.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

1 code implementation • 24 Oct 2023 • Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli

When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits.

Decoder, Language Modelling

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

1 code implementation • 10 Oct 2023 • Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Direct Speech Translation for Automatic Subtitling

1 code implementation • 27 Sep 2022 • Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e. subtitles and their corresponding timestamps.

Translation

Evaluating Subtitle Segmentation for End-to-end Generation Systems

1 code implementation • LREC 2022 • Alina Karakanta, François Buet, Mauro Cettolo, François Yvon

Subtitle segmentation can be evaluated with sequence segmentation metrics against a human reference.

Segmentation

Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

no code implementations • ACL 2021 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.

Translation

Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation

no code implementations • ICNLSP 2021 • Marco Gaido, Matteo Negri, Mauro Cettolo, Marco Turchi

The audio segmentation mismatch between training data and those seen at run-time is a major problem in direct speech translation.

Action Detection, Activity Detection, +2

CTC-based Compression for Direct Speech Translation

1 code implementation • EACL 2021 • Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST).

Translation

Contextualized Translation of Automatically Segmented Speech

1 code implementation • 5 Aug 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Segmentation, Sentence, +2

Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

no code implementations • WS 2016 • Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis

We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction.

Language Modelling, POS

A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation

no code implementations • COLING 2018 • Surafel M. Lakew, Mauro Cettolo, Marcello Federico

Motivated by this, our work (i) provides a quantitative and comparative analysis of the translations produced by bilingual, multilingual and zero-shot systems; (ii) investigates the translation quality of two of the currently dominant neural architectures in MT, which are the Recurrent and the Transformer ones; and (iii) quantitatively explores how the closeness between languages influences the zero-shot translation.

Machine Translation, NMT, +2

Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction

no code implementations • WS 2017 • Sharid Loáiciga, Sara Stymne, Preslav Nakov, Christian Hardmeier, Jörg Tiedemann, Mauro Cettolo, Yannick Versley

We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual pronoun prediction.

Language Modelling, Machine Translation, +2

Unsupervised Clustering of Commercial Domains for Adaptive Machine Translation

no code implementations • 14 Dec 2016 • Mauro Cettolo, Mara Chinea Rios, Roldano Cattoni

In this paper, we report on domain clustering in the ambit of an adaptive MT architecture.

Clustering, Machine Translation, +1

An Arabic-Hebrew parallel corpus of TED talks

no code implementations • 3 Oct 2016 • Mauro Cettolo

We describe an Arabic-Hebrew parallel corpus of TED talks built upon WIT3, the Web inventory that repurposes the original content of the TED website in a way which is more convenient for MT researchers.

Sentence

Neural versus Phrase-Based Machine Translation Quality: a Case Study

no code implementations • EMNLP 2016 • Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico

Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT).

Machine Translation, NMT, +1

WAGS: A Beautiful English-Italian Benchmark Supporting Word Alignment Evaluation on Rare Words

no code implementations • LREC 2016 • Luisa Bentivogli, Mauro Cettolo, M. Amin Farajian, Marcello Federico

This paper presents WAGS (Word Alignment Gold Standard), a novel benchmark which allows extensive evaluation of WA tools on out-of-vocabulary (OOV) and rare words.

Sentence, Word Alignment