Search Results for author: Matteo Negri

Found 130 papers, 35 papers with code

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Věra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nădejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Is “moby dick” a Whale or a Bird? Named Entities and Terminology in Speech Translation

no code implementations • EMNLP 2021 • Marco Gaido, Susana Rodríguez, Matteo Negri, Luisa Bentivogli, Marco Turchi

Automatic translation systems are known to struggle with rare words.

Translation

Translation Quality and Productivity: A Study on Rich Morphology Languages

no code implementations • MTSummit 2017 • Lucia Specia, Kim Harris, Frédéric Blain, Aljoscha Burchardt, Vivien Macketanz, Inguna Skadiņa, Matteo Negri, Marco Turchi

Translation

FBK’s Multilingual Neural Machine Translation System for IWSLT 2017

no code implementations • IWSLT 2017 • Surafel M. Lakew, Quintino F. Lotito, Marco Turchi, Matteo Negri, Marcello Federico

Particularly, we focus on the four zero-shot directions and show how a multilingual model trained with small data can provide reasonable results.

Machine Translation Transfer Learning +1

Findings of the WMT 2020 Shared Task on Automatic Post-Editing

no code implementations • WMT (EMNLP) 2020 • Rajen Chatterjee, Markus Freitag, Matteo Negri, Marco Turchi

Due to i) the different source/domain of data compared to the past (Wikipedia vs Information Technology), ii) the different quality of the initial translations to be corrected and iii) the introduction of a new language pair (English-Chinese), this year’s results are not directly comparable with last year’s round.

Automatic Post-Editing NMT

Zero-Shot Neural Machine Translation with Self-Learning Cycle

no code implementations • MTSummit 2021 • Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions.

Machine Translation NMT +2

Instance Selection for Online Automatic Post-Editing in a multi-domain scenario

no code implementations • AMTA 2016 • Rajen Chatterjee, Mihael Arcan, Matteo Negri, Marco Turchi

In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback in the translation workflow.

Automatic Post-Editing Information Retrieval +2

The IWSLT 2019 Evaluation Campaign

no code implementations • EMNLP (IWSLT) 2019 • Jan Niehues, Roldano Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loïc Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.

Translation

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations • WMT (EMNLP) 2021 • Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation Translation

Machine-oriented NMT Adaptation for Zero-shot NLP tasks: Comparing the Usefulness of Close and Distant Languages

no code implementations • VarDial (COLING) 2020 • Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

In this work, we tackle the problem in a multilingual setting where a single NMT model translates from multiple languages for downstream automatic processing in the target language.

Machine Translation NMT

Findings of the IWSLT 2021 Evaluation Campaign

no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Automatic Translation for Multiple NLP tasks: a Multi-task Approach to Machine-oriented NMT Adaptation

no code implementations • EAMT 2020 • Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

We address this problem by proposing a multi-task approach to machine-oriented NMT adaptation, which is capable of serving multiple downstream tasks with a single system.

Machine Translation NMT +1

On the Dynamics of Gender Learning in Speech Translation

no code implementations • NAACL (GeBNLP) 2022 • Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

In this work, we contribute to such a line of inquiry by exploring the emergence of gender bias in Speech Translation (ST).

Translation

Extending the MuST-C Corpus for a Comparative Evaluation of Speech Translation Technology

no code implementations • EAMT 2022 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Matteo Negri, Marco Turchi

This project aimed at extending the test sets of the MuST-C speech translation (ST) corpus with new reference translations.

Machine Translation Translation

Towards a methodology for evaluating automatic subtitling

no code implementations • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

In response to the increasing interest towards automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles.

Segmentation

Post-editing in Automatic Subtitling: A Subtitlers’ perspective

1 code implementation • EAMT 2022 • Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

Subtitling tools are recently being adapted for post-editing by providing automatically generated subtitles, and featuring not only machine translation, but also automatic segmentation and synchronisation.

Machine Translation Translation

FBK’s Neural Machine Translation Systems for IWSLT 2016

no code implementations • IWSLT 2016 • M. Amin Farajian, Rajen Chatterjee, Costanza Conforti, Shahab Jalalvand, Vevake Balaraman, Mattia A. Di Gangi, Duygu Ataman, Marco Turchi, Matteo Negri, Marcello Federico

They leverage linguistic information such as lemmas and part-of-speech tags of the source words in the form of additional factors along with the words.

Machine Translation NMT +1

Data Augmentation for End-to-End Speech Translation: FBK@IWSLT ’19

no code implementations • EMNLP (IWSLT) 2019 • Mattia A. Di Gangi, Matteo Negri, Viet Nhat Nguyen, Amirhossein Tebbifakhr, Marco Turchi

On the training side, we focused on data augmentation techniques recently proposed for ST and automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

1 code implementation • 20 Feb 2024 • Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The attention mechanism, a cornerstone of state-of-the-art neural models, faces computational hurdles in processing long sequences due to its quadratic complexity.

Automatic Speech Recognition Image Classification +3

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

no code implementations • 19 Feb 2024 • Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The field of natural language processing (NLP) has recently witnessed a transformative shift with the emergence of foundation models, particularly Large Language Models (LLMs) that have revolutionized text-based NLP.

Speech-to-Text Translation

A Prompt Response to the Demand for Automatic Gender-Neutral Translation

no code implementations • 8 Feb 2024 • Beatrice Savoldi, Andrea Piergentili, Dennis Fucci, Matteo Negri, Luisa Bentivogli

Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies.

Machine Translation Translation

Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES

1 code implementation • 30 Oct 2023 • Beatrice Savoldi, Marco Gaido, Matteo Negri, Luisa Bentivogli

As part of the WMT-2023 "Test suites" shared task, in this paper we summarize the results of two test suites evaluations: MuST-SHE-WMT23 and INES.

Fairness

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

1 code implementation • 24 Oct 2023 • Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli

When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits.

Language Modelling

How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation

1 code implementation • 23 Oct 2023 • Marco Gaido, Dennis Fucci, Matteo Negri, Luisa Bentivogli

When translating from notional gender languages (e.g., English) into grammatical gender languages (e.g., Italian), the generated translation requires explicit gender assignments for various words, including those referring to the speaker.

Sentence Translation

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

1 code implementation • 10 Oct 2023 • Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus

1 code implementation • 8 Oct 2023 • Andrea Piergentili, Beatrice Savoldi, Dennis Fucci, Matteo Negri, Luisa Bentivogli

Gender inequality is embedded in our communication practices and perpetuated in translation technologies.

Benchmarking Machine Translation +1

Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023

1 code implementation • 27 Sep 2023 • Sara Papi, Marco Gaido, Matteo Negri

This paper describes the FBK's participation in the Simultaneous Translation and Automatic Subtitling tracks of the IWSLT 2023 Evaluation Campaign.

Translation

AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation

1 code implementation • 19 May 2023 • Sara Papi, Marco Turchi, Matteo Negri

Attention is the core mechanism of today's most used architectures for natural language processing and has been analyzed from many perspectives, including its effectiveness for machine translation-related tasks.

Machine Translation Translation +1

Storage and Learning phase transitions in the Random-Features Hopfield Model

no code implementations • 29 Mar 2023 • Matteo Negri, Clarissa Lauditi, Gabriele Perugini, Carlo Lucibello, Enrico Malatesta

The Hopfield model is a paradigmatic model of neural networks that has been analyzed for many decades in the statistical physics, neuroscience, and machine learning communities.

Retrieval

When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP

no code implementations • 28 Mar 2023 • Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri

Despite its crucial role in research experiments, code correctness is often presumed only on the basis of the perceived quality of results.

Automatic Speech Recognition speech-recognition +1

Gender Neutralization for an Inclusive Machine Translation: from Theoretical Foundations to Open Challenges

no code implementations • 24 Jan 2023 • Andrea Piergentili, Dennis Fucci, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri

Gender inclusivity in language technologies has become a prominent research topic.

Machine Translation Translation

Attention as a Guide for Simultaneous Speech Translation

1 code implementation • 15 Dec 2022 • Sara Papi, Matteo Negri, Marco Turchi

The study of the attention mechanism has sparked interest in many fields, such as language modeling and machine translation.

Language Modelling Machine Translation +1

Joint Speech Translation and Named Entity Recognition

1 code implementation • 21 Oct 2022 • Marco Gaido, Sara Papi, Matteo Negri, Marco Turchi

Modern automatic translation systems aim at placing the human at the center by providing contextual support and knowledge.

Computational Efficiency Entity Linking +4

Multi-mode fiber reservoir computing overcomes shallow neural networks classifiers

no code implementations • 10 Oct 2022 • Daniele Ancora, Matteo Negri, Antonio Gianfrate, Dimitris Trypogeorgos, Lorenzo Dominici, Daniele Sanvitto, Federico Ricci-Tersenghi, Luca Leuzzi

In the field of disordered photonics, a common objective is to characterize optically opaque materials for controlling light delivery or performing imaging.

Direct Speech Translation for Automatic Subtitling

1 code implementation • 27 Sep 2022 • Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e., subtitles and their corresponding timestamps.

Translation

Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora

1 code implementation • 21 Sep 2022 • Sara Papi, Alina Karakanta, Matteo Negri, Marco Turchi

Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant to specific displaying guidelines.

Translation

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

1 code implementation • NAACL (AutoSimTrans) 2022 • Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL).

Translation

Who Are We Talking About? Handling Person Names in Speech Translation

1 code implementation • IWSLT (ACL) 2022 • Marco Gaido, Matteo Negri, Marco Turchi

Recent work has shown that systems for speech translation (ST) -- similarly to automatic speech recognition (ASR) -- poorly handle person names.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

1 code implementation • IWSLT (ACL) 2022 • Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality.

Sentence Translation

Does Simultaneous Speech Translation need Simultaneous Models?

1 code implementation • 8 Apr 2022 • Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

In simultaneous speech translation (SimulST), finding the best trade-off between high translation quality and low latency is a challenging task.

Translation

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation

1 code implementation • ACL 2022 • Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages.

POS Translation

Native state of natural proteins optimises local entropy

1 code implementation • 25 Nov 2021 • Matteo Negri, Guido Tiana, Riccardo Zecchina

The differing ability of polypeptide conformations to act as the native state of proteins has long been rationalized in terms of differing kinetic accessibility or thermodynamic stability.

Visualization: the missing factor in Simultaneous Speech Translation

no code implementations • 31 Oct 2021 • Sara Papi, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) is the task in which output generation has to be performed on partial, incremental speech input.

Translation

Is "moby dick" a Whale or a Bird? Named Entities and Terminology in Speech Translation

1 code implementation • 15 Sep 2021 • Marco Gaido, Susana Rodríguez, Matteo Negri, Luisa Bentivogli, Marco Turchi

Automatic translation systems are known to struggle with rare words.

Translation

Speechformer: Reducing Information Loss in Direct Speech Translation

1 code implementation • EMNLP 2021 • Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Transformer-based models have gained increasing popularity achieving state-of-the-art performance in many research fields including speech translation.

Ranked #1 on Speech-to-Text Translation on MuST-C EN->NL

Speech-to-Text Translation Translation

Simultaneous Speech Translation for Live Subtitling: from Delay to Display

1 code implementation • MTSummit 2021 • Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi

Experiments on three language pairs (en→it, de, fr) show that scrolling lines is the only mode achieving an acceptable reading speed while keeping delay close to a 4-second threshold.

Translation

Between Flexibility and Consistency: Joint Generation of Captions and Subtitles

1 code implementation • ACL (IWSLT) 2021 • Alina Karakanta, Marco Gaido, Matteo Negri, Marco Turchi

Speech translation (ST) has lately received growing interest for the generation of subtitles without the need for an intermediate source language transcription and timing (i.e., captions).

Translation

Dealing with training and test segmentation mismatch: FBK@IWSLT2021

no code implementations • ACL (IWSLT) 2021 • Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora.

Action Detection Activity Detection +4

Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

no code implementations • ACL 2021 • Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.

Translation

How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation

1 code implementation • Findings (ACL) 2021 • Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In light of this finding, we propose a combined approach that preserves BPE overall translation quality, while leveraging the higher ability of character-based segmentation to properly translate gender.

Segmentation Translation

Beyond Voice Activity Detection: Hybrid Audio Segmentation for Direct Speech Translation

no code implementations • ICNLSP 2021 • Marco Gaido, Matteo Negri, Mauro Cettolo, Marco Turchi

The audio segmentation mismatch between training data and those seen at run-time is a major problem in direct speech translation.

Action Detection Activity Detection +2

Gender Bias in Machine Translation

1 code implementation • 13 Apr 2021 • Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Machine translation (MT) technology has facilitated our daily tasks by providing accessible shortcuts for gathering, elaborating and communicating information.

Machine Translation Translation

Tutorial Proposal: End-to-End Speech Translation

no code implementations • EACL 2021 • Jan Niehues, Elizabeth Salesky, Marco Turchi, Matteo Negri

Speech translation is the translation of speech in one language typically to text in another, traditionally accomplished through a combination of automatic speech recognition and machine translation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Self-Learning for Zero Shot Neural Machine Translation

no code implementations • 10 Mar 2021 • Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource rich conditions.

Machine Translation NMT +2

CTC-based Compression for Direct Speech Translation

1 code implementation • EACL 2021 • Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST).

Translation

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations • 2 Feb 2021 • Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

speech-recognition Speech Recognition +1

On Knowledge Distillation for Direct Speech Translation

1 code implementation • 9 Dec 2020 • Marco Gaido, Mattia A. Di Gangi, Matteo Negri, Marco Turchi

Direct speech translation (ST) has been shown to be a complex task requiring knowledge transfer from its sub-tasks: automatic speech recognition (ASR) and machine translation (MT).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Breeding Gender-aware Direct Speech Translation Systems

no code implementations • COLING 2020 • Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g., speaker's vocal characteristics) that is otherwise lost in the cascade framework.

Machine Translation Translation

The Two Shades of Dubbing in Neural Machine Translation

no code implementations • COLING 2020 • Alina Karakanta, Supratik Bhattacharya, Shravan Nayak, Timo Baumann, Matteo Negri, Marco Turchi

Dubbing has two shades; synchronisation constraints are applied only when the actor's mouth is visible on screen, while the translation is unconstrained for off-screen dubbing.

Machine Translation Translation +1

Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures

no code implementations • 27 Oct 2020 • Carlo Baldassi, Enrico M. Malatesta, Matteo Negri, Riccardo Zecchina

We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function.

On Target Segmentation for Direct Speech Translation

no code implementations • AMTA 2020 • Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi

Then, subword-level segmentation became the state of the art in neural machine translation as it produces shorter sequences that reduce the training time, while being superior to word-level models.

Data Augmentation Machine Translation +2

Contextualized Translation of Automatically Segmented Speech

1 code implementation • 5 Aug 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Segmentation Sentence +2

Findings of the IWSLT 2020 Evaluation Campaign

no code implementations • WS 2020 • Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus

no code implementations • ACL 2020 • Luisa Bentivogli, Beatrice Savoldi, Matteo Negri, Mattia Antonino Di Gangi, Roldano Cattoni, Marco Turchi

Translating from languages without productive grammatical gender like English into gender-marked languages is a well-known difficulty for machines.

Machine Translation Sentence +1

End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

no code implementations • WS 2020 • Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation.

Data Augmentation Knowledge Distillation +3

Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?

no code implementations • WS 2020 • Alina Karakanta, Matteo Negri, Marco Turchi

Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily.

Machine Translation NMT +1

Low Resource Neural Machine Translation: A Benchmark for Five African Languages

1 code implementation • 31 Mar 2020 • Surafel M. Lakew, Matteo Negri, Marco Turchi

Recent advances in Neural Machine Translation (NMT) have shown improvements in low-resource language (LRL) translation tasks.

Low-Resource Neural Machine Translation NMT +2

MuST-Cinema: a Speech-to-Subtitles corpus

no code implementations • LREC 2020 • Alina Karakanta, Matteo Negri, Marco Turchi

Growing needs in localising audiovisual content in multiple languages through subtitles call for the development of automatic solutions for human subtitling.

Machine Translation NMT +1

Adapting Multilingual Neural Machine Translation to Unseen Languages

1 code implementation • EMNLP (IWSLT) 2019 • Surafel M. Lakew, Alina Karakanta, Marcello Federico, Matteo Negri, Marco Turchi

In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance.

Data Augmentation Machine Translation +2

Instance-Based Model Adaptation For Direct Speech Translation

no code implementations • 23 Oct 2019 • Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi

Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora.

Domain Adaptation Speech-to-Text Translation +1

One-To-Many Multilingual End-to-end Speech Translation

no code implementations • 8 Oct 2019 • Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Multilingual solutions are widely studied in MT and usually rely on "target forcing", in which multilingual parallel data are combined to train a single model by prepending to the input sequences a language token that specifies the target language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Machine Translation for Machines: the Sentiment Classification Use Case

no code implementations • IJCNLP 2019 • Amirhossein Tebbifakhr, Luisa Bentivogli, Matteo Negri, Marco Turchi

Towards this objective, we present a reinforcement learning technique based on a new candidate sampling strategy, which exploits the results obtained on the downstream task as weak feedback.

Classification General Classification +7

Natural representation of composite data with replicated autoencoders

no code implementations • 29 Sep 2019 • Matteo Negri, Davide Bergamini, Carlo Baldassi, Riccardo Zecchina, Christoph Feinauer

Generative processes in biology and other fields often produce data that can be regarded as resulting from a composition of basic features.

Multilingual Neural Machine Translation for Zero-Resource Languages

1 code implementation • 16 Sep 2019 • Surafel M. Lakew, Marcello Federico, Matteo Negri, Marco Turchi

In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT).

Machine Translation NMT +1

Paper
Code

Effort-Aware Neural Automatic Post-Editing

no code implementations • WS 2019 • Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

For this purpose, following the common approach in multilingual NMT, we prepend a special token to the beginning of both the source text and the MT output indicating the required amount of post-editing.

Automatic Post-Editing NMT +1

Paper
Add Code

Enhancing Transformer for End-to-end Speech-to-Text Translation

no code implementations • WS 2019 • Mattia Antonino Di Gangi, Matteo Negri, Roldano Cattoni, Roberto Dessi, Marco Turchi

Speech-to-Text Translation Translation

Paper
Add Code

Improving Translations by Combining Fuzzy-Match Repair with Automatic Post-Editing

no code implementations • WS 2019 • John Ortega, Felipe Sánchez-Martínez, Marco Turchi, Matteo Negri

Automatic Post-Editing

Paper
Add Code

Findings of the WMT 2019 Shared Task on Automatic Post-Editing

no code implementations • WS 2019 • Rajen Chatterjee, Christian Federmann, Matteo Negri, Marco Turchi

Seven teams participated in the English-German task, with a total of 18 submitted runs.

Automatic Post-Editing Translation

Paper
Add Code

Neural Text Simplification in Low-Resource Conditions Using Weak Supervision

no code implementations • WS 2019 • Alessio Palmero Aprosio, Sara Tonelli, Marco Turchi, Matteo Negri, Mattia A. Di Gangi

Inspired by the machine translation field, in which synthetic parallel pairs generated from monolingual data yield significant improvements to neural models, in this paper we exploit large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs.

Machine Translation Sentence +3

Paper
Add Code

MuST-C: a Multilingual Speech Translation Corpus

no code implementations • NAACL 2019 • Mattia A. Di Gangi, Roldano Cattoni, Luisa Bentivogli, Matteo Negri, Marco Turchi

Current research on spoken language translation (SLT) has to contend with the scarcity of sizeable and publicly available training corpora.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Improving Zero-Shot Translation of Low-Resource Languages

1 code implementation • IWSLT 2017 • Surafel M. Lakew, Quintino F. Lotito, Matteo Negri, Marco Turchi, Marcello Federico

Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zero-shot) translation directions not observed at training time.

Machine Translation Translation

Paper
Code

Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary

2 code implementations • IWSLT (EMNLP) 2018 • Surafel M. Lakew, Aliia Erofeeva, Matteo Negri, Marcello Federico, Marco Turchi

Our approach allows extending an initial model for a given language pair to cover new languages by adapting its vocabulary as long as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model).

Machine Translation NMT +2

Paper
Code

Generating E-Commerce Product Titles and Predicting their Quality

no code implementations • WS 2018 • José G. Camargo de Souza, Michael Kozielski, Prashant Mathur, Ernie Chang, Marco Guerini, Matteo Negri, Marco Turchi, Evgeny Matusov

The setting requires the generation process to be fast and the generated title to be both human-readable and concise.

Text Generation

Paper
Add Code

Fine-tuning on Clean Data for End-to-End Speech Translation: FBK @ IWSLT 2018

no code implementations • IWSLT (EMNLP) 2018 • Mattia Antonino Di Gangi, Roberto Dessì, Roldano Cattoni, Matteo Negri, Marco Turchi

This paper describes FBK's submission to the end-to-end English-German speech translation task at IWSLT 2018.

Machine Translation Translation

Paper
Add Code

Proceedings of the Third Conference on Machine Translation: Shared Task Papers

no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor

Machine Translation Translation

Paper
Add Code

Findings of the WMT 2018 Shared Task on Automatic Post-Editing

no code implementations • WS 2018 • Rajen Chatterjee, Matteo Negri, Raphael Rubino, Marco Turchi

In the former subtask, characterized by original translations of lower quality, top results achieved impressive improvements, up to -6.24 TER and +9.53 BLEU points over the baseline "do-nothing" system.

Automatic Post-Editing NMT +1

Paper
Add Code

Multi-source transformer with combined losses for automatic post editing

no code implementations • WS 2018 • Amirhossein Tebbifakhr, Ruchit Agrawal, Matteo Negri, Marco Turchi

In the first subtask, our system improves over the baseline up to -5.3 TER and +8.23 BLEU points, ranking second out of 11 submitted runs.

Automatic Post-Editing NMT +2

Paper
Add Code

eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing

no code implementations • LREC 2018 • Matteo Negri, Marco Turchi, Rajen Chatterjee, Nicola Bertoldi

eSCAPE consists of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly-available parallel corpora, and using the target side as an artificial human post-edit.

Automatic Post-Editing Sentence

Paper
Add Code

Combining Quality Estimation and Automatic Post-editing to Enhance Machine Translation output

no code implementations • WS 2018 • Rajen Chatterjee, Matteo Negri, Marco Turchi, Frédéric Blain, Lucia Specia

Automatic Post-Editing Translation

Paper
Add Code

Guiding Neural Machine Translation Decoding with External Knowledge

no code implementations • WS 2017 • Rajen Chatterjee, Matteo Negri, Marco Turchi, Marcello Federico, Lucia Specia, Frédéric Blain

Machine Translation Translation

Paper
Add Code

Multi-Domain Neural Machine Translation through Unsupervised Adaptation

no code implementations • WS 2017 • M. Amin Farajian, Marco Turchi, Matteo Negri, Marcello Federico

Machine Translation Translation

Paper
Add Code

Findings of the 2017 Conference on Machine Translation (WMT17)

no code implementations • WS 2017 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Shu-Jian Huang, Matthias Huck, Philipp Koehn, Qun Liu, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Raphael Rubino, Lucia Specia, Marco Turchi

Automatic Post-Editing Multimodal Machine Translation +1

Paper
Add Code

Multi-source Neural Automatic Post-Editing: FBK's participation in the WMT 2017 APE shared task

no code implementations • WS 2017 • Rajen Chatterjee, M. Amin Farajian, Matteo Negri, Marco Turchi, Ankit Srivastava, Santanu Pal

Automatic Post-Editing Language Modelling

Paper
Add Code

Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English

no code implementations • 31 Jul 2017 • Duygu Ataman, Matteo Negri, Marco Turchi, Marcello Federico

In this paper, we propose a new vocabulary reduction method for NMT, which can reduce the vocabulary of a given input corpus at any rate while also considering the morphological properties of the language.

Machine Translation Morphological Analysis +2

Paper
Add Code

Automatic Quality Estimation for ASR System Combination

no code implementations • 22 Jun 2017 • Shahab Jalalvand, Matteo Negri, Daniele Falavigna, Marco Matassoni, Marco Turchi

In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Online Automatic Post-editing for MT in a Multi-Domain Translation Environment

no code implementations • EACL 2017 • Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, Marco Turchi

Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples.

Automatic Post-Editing Translation

Paper
Add Code

Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario

no code implementations • EACL 2017 • M. Amin Farajian, Marco Turchi, Matteo Negri, Nicola Bertoldi, Marcello Federico

State-of-the-art neural machine translation (NMT) systems are generally trained on specific domains by carefully selecting the training sets and applying proper domain adaptation techniques.

Domain Adaptation Machine Translation +2

Paper
Add Code

DNN adaptation by automatic quality estimation of ASR hypotheses

no code implementations • 6 Feb 2017 • Daniele Falavigna, Marco Matassoni, Shahab Jalalvand, Matteo Negri, Marco Turchi

Our hypothesis is that significant improvements can be achieved by: i) automatically transcribing the evaluation data we are currently trying to recognise, and ii) selecting from it a subset of "good quality" instances based on the word error rate (WER) scores predicted by a QE component.

Sentence

Paper
Add Code

Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi

Machine Translation Translation

Paper
Add Code

An Unsupervised Method for Automatic Translation Memory Cleaning

no code implementations • ACL 2016 • Masoud Jalili Sabet, Matteo Negri, Marco Turchi, Eduard Barbu

Machine Translation Translation

Paper
Add Code

The FBK Participation in the WMT 2016 Automatic Post-editing Shared Task

no code implementations • WS 2016 • Rajen Chatterjee, José G. C. de Souza, Matteo Negri, Marco Turchi

Automatic Post-Editing Data Augmentation

Paper
Add Code

Findings of the 2016 Conference on Machine Translation

no code implementations • WS 2016 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Martin Popel, Matt Post, Raphael Rubino, Carolina Scarton, Lucia Specia, Marco Turchi, Karin Verspoor, Marcos Zampieri

Automatic Post-Editing Multimodal Machine Translation +1

Paper
Add Code

TMop: a Tool for Unsupervised Translation Memory Cleaning

1 code implementation • ACL 2016 • Masoud Jalili Sabet, Matteo Negri, Marco Turchi, José G. C. de Souza, Marcello Federico

Machine Translation Translation

Paper
Code

TranscRater: a Tool for Automatic Speech Recognition Quality Estimation

no code implementations • ACL 2016 • Shahab Jalalvand, Matteo Negri, Marco Turchi, José G. C. de Souza, Daniele Falavigna, Mohammed R. H. Qwaider

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

FBK HLT-MT at SemEval-2016 Task 1: Cross-lingual Semantic Similarity Measurement Using Quality Estimation Features and Compositional Bilingual Word Embeddings

no code implementations • SEMEVAL 2016 • Duygu Ataman, José G. C. de Souza, Marco Turchi, Matteo Negri

Cross-Lingual Semantic Textual Similarity Machine Translation +6

Paper
Add Code

Findings of the 2015 Workshop on Statistical Machine Translation

no code implementations • WS 2015 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Barry Haddow, Matthias Huck, Chris Hokamp, Philipp Koehn, Varvara Logacheva, Christof Monz, Matteo Negri, Matt Post, Carolina Scarton, Lucia Specia, Marco Turchi

Automatic Post-Editing Translation

Paper
Add Code

The FBK Participation in the WMT15 Automatic Post-editing Shared Task

no code implementations • WS 2015 • Rajen Chatterjee, Marco Turchi, Matteo Negri

Automatic Post-Editing

Paper
Add Code

Exploring the Planet of the APEs: a Comparative Study of State-of-the-art Methods for MT Automatic Post-Editing

no code implementations • IJCNLP 2015 • Rajen Chatterjee, Marion Weller, Matteo Negri, Marco Turchi

Automatic Post-Editing Domain Adaptation

Paper
Add Code

MT Quality Estimation for Computer-assisted Translation: Does it Really Help?

no code implementations • IJCNLP 2015 • Marco Turchi, Matteo Negri, Marcello Federico

Machine Translation Translation

Paper
Add Code

Online Multitask Learning for Machine Translation Quality Estimation

1 code implementation • IJCNLP 2015 • José G. C. de Souza, Matteo Negri, Elisa Ricci, Marco Turchi

Machine Translation Translation

Paper
Code

Driving ROVER with Segment-based ASR Quality Estimation

no code implementations • ACL 2015 • Matteo Negri, Marco Turchi, Daniele Falavigna, Shahab Jalalvand

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Multitask Learning for Adaptive Quality Estimation of Automatically Transcribed Utterances

no code implementations • HLT 2015 • Matteo Negri, José G. C. de Souza, Marco Turchi, Daniele Falavigna, Hamed Zamani

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Assessing the Impact of Translation Errors on Machine Translation Quality with Mixed-effects Models

no code implementations • EMNLP 2014 • Marcello Federico, Matteo Negri, Luisa Bentivogli, Marco Turchi

Machine Translation Translation

Paper
Add Code

Machine Translation Quality Estimation Across Domains

no code implementations • COLING 2014 • José G. C. de Souza, Marco Turchi, Matteo Negri

Machine Translation Translation

Paper
Add Code

Quality Estimation for Automatic Speech Recognition

no code implementations • COLING 2014 • Matteo Negri, Marco Turchi, José G. C. de Souza, Daniele Falavigna

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

The MateCat Tool

no code implementations • COLING 2014 • Marcello Federico, Nicola Bertoldi, Mauro Cettolo, Matteo Negri, Marco Turchi, Marco Trombetti, Alessandro Cattelan, Antonio Farina, Domenico Lupinetti, Andrea Martines, Alberto Massidda, Holger Schwenk, Loïc Barrault, Frederic Blain, Philipp Koehn, Christian Buck, Ulrich Germann

Machine Translation

Paper
Add Code

FBK-UPV-UEdin participation in the WMT14 Quality Estimation shared-task

no code implementations • WS 2014 • José Guilherme Camargo de Souza, Jesús González-Rubio, Christian Buck, Marco Turchi, Matteo Negri

Language Modelling Machine Translation

Paper
Add Code

Adaptive Quality Estimation for Machine Translation

no code implementations • ACL 2014 • Marco Turchi, Antonios Anastasopoulos, José G. C. de Souza, Matteo Negri

Machine Translation Translation

Paper
Add Code

Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements

no code implementations • LREC 2014 • Marco Turchi, Matteo Negri

To overcome these issues, we present an automatic method for the annotation of (source, target) pairs with binary judgements that reflect an empirical and easily interpretable notion of quality.

Machine Translation Re-Ranking +2