Search Results for author: Matteo Negri

Found 130 papers, 35 papers with code

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

FBK’s Multilingual Neural Machine Translation System for IWSLT 2017

no code implementations IWSLT 2017 Surafel M. Lakew, Quintino F. Lotito, Marco Turchi, Matteo Negri, Marcello Federico

Particularly, we focus on the four zero-shot directions and show how a multilingual model trained with small data can provide reasonable results.

Machine Translation Transfer Learning +1

Findings of the WMT 2020 Shared Task on Automatic Post-Editing

no code implementations WMT (EMNLP) 2020 Rajen Chatterjee, Markus Freitag, Matteo Negri, Marco Turchi

Due to i) the different source/domain of data compared to the past (Wikipedia vs Information Technology), ii) the different quality of the initial translations to be corrected and iii) the introduction of a new language pair (English-Chinese), this year’s results are not directly comparable with last year’s round.

Automatic Post-Editing NMT

Zero-Shot Neural Machine Translation with Self-Learning Cycle

no code implementations MTSummit 2021 Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions.

Machine Translation NMT +2

Instance Selection for Online Automatic Post-Editing in a multi-domain scenario

no code implementations AMTA 2016 Rajen Chatterjee, Mihael Arcan, Matteo Negri, Marco Turchi

In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback in the translation workflow.

Automatic Post-Editing Information Retrieval +2

The IWSLT 2019 Evaluation Campaign

no code implementations EMNLP (IWSLT) 2019 Jan Niehues, Roldano Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loïc Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.

Translation

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation Translation

Machine-oriented NMT Adaptation for Zero-shot NLP tasks: Comparing the Usefulness of Close and Distant Languages

no code implementations VarDial (COLING) 2020 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

In this work, we tackle the problem in a multilingual setting where a single NMT model translates from multiple languages for downstream automatic processing in the target language.

Machine Translation NMT

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Automatic Translation for Multiple NLP tasks: a Multi-task Approach to Machine-oriented NMT Adaptation

no code implementations EAMT 2020 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

We address this problem by proposing a multi-task approach to machine-oriented NMT adaptation, which is capable of serving multiple downstream tasks with a single system.

Machine Translation NMT +1

On the Dynamics of Gender Learning in Speech Translation

no code implementations NAACL (GeBNLP) 2022 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

In this work, we contribute to such a line of inquiry by exploring the emergence of gender bias in Speech Translation (ST).

Translation

Towards a methodology for evaluating automatic subtitling

no code implementations EAMT 2022 Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

In response to the increasing interest towards automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles.

Segmentation

Post-editing in Automatic Subtitling: A Subtitlers’ perspective

1 code implementation EAMT 2022 Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

Subtitling tools have recently been adapted for post-editing by providing automatically generated subtitles, and featuring not only machine translation, but also automatic segmentation and synchronisation.

Machine Translation Translation

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

1 code implementation 20 Feb 2024 Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The attention mechanism, a cornerstone of state-of-the-art neural models, faces computational hurdles in processing long sequences due to its quadratic complexity.

Automatic Speech Recognition Image Classification +3

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

no code implementations 19 Feb 2024 Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The field of natural language processing (NLP) has recently witnessed a transformative shift with the emergence of foundation models, particularly Large Language Models (LLMs) that have revolutionized text-based NLP.

Speech-to-Text Translation

A Prompt Response to the Demand for Automatic Gender-Neutral Translation

no code implementations 8 Feb 2024 Beatrice Savoldi, Andrea Piergentili, Dennis Fucci, Matteo Negri, Luisa Bentivogli

Gender-neutral translation (GNT) that avoids biased and undue binary assumptions is a pivotal challenge for the creation of more inclusive translation technologies.

Machine Translation Translation

Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES

1 code implementation 30 Oct 2023 Beatrice Savoldi, Marco Gaido, Matteo Negri, Luisa Bentivogli

As part of the WMT-2023 "Test suites" shared task, in this paper we summarize the results of two test suites evaluations: MuST-SHE-WMT23 and INES.

Fairness

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

1 code implementation 24 Oct 2023 Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli

When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits.

Language Modelling

How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation

1 code implementation 23 Oct 2023 Marco Gaido, Dennis Fucci, Matteo Negri, Luisa Bentivogli

When translating from notional gender languages (e.g., English) into grammatical gender languages (e.g., Italian), the generated translation requires explicit gender assignments for various words, including those referring to the speaker.

Sentence Translation

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

1 code implementation 10 Oct 2023 Dennis Fucci, Marco Gaido, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023

1 code implementation 27 Sep 2023 Sara Papi, Marco Gaido, Matteo Negri

This paper describes FBK's participation in the Simultaneous Translation and Automatic Subtitling tracks of the IWSLT 2023 Evaluation Campaign.

Translation

AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation

1 code implementation 19 May 2023 Sara Papi, Marco Turchi, Matteo Negri

Attention is the core mechanism of today's most used architectures for natural language processing and has been analyzed from many perspectives, including its effectiveness for machine translation-related tasks.

Machine Translation Translation +1

Storage and Learning phase transitions in the Random-Features Hopfield Model

no code implementations 29 Mar 2023 Matteo Negri, Clarissa Lauditi, Gabriele Perugini, Carlo Lucibello, Enrico Malatesta

The Hopfield model is a paradigmatic model of neural networks that has been analyzed for many decades in the statistical physics, neuroscience, and machine learning communities.

Retrieval

When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP

no code implementations 28 Mar 2023 Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri

Despite its crucial role in research experiments, code correctness is often presumed only on the basis of the perceived quality of results.

Automatic Speech Recognition speech-recognition +1

Attention as a Guide for Simultaneous Speech Translation

1 code implementation 15 Dec 2022 Sara Papi, Matteo Negri, Marco Turchi

The study of the attention mechanism has sparked interest in many fields, such as language modeling and machine translation.

Language Modelling Machine Translation +1

Joint Speech Translation and Named Entity Recognition

1 code implementation 21 Oct 2022 Marco Gaido, Sara Papi, Matteo Negri, Marco Turchi

Modern automatic translation systems aim to place the human at the center by providing contextual support and knowledge.

Computational Efficiency Entity Linking +4

Multi-mode fiber reservoir computing overcomes shallow neural networks classifiers

no code implementations 10 Oct 2022 Daniele Ancora, Matteo Negri, Antonio Gianfrate, Dimitris Trypogeorgos, Lorenzo Dominici, Daniele Sanvitto, Federico Ricci-Tersenghi, Luca Leuzzi

In the field of disordered photonics, a common objective is to characterize optically opaque materials for controlling light delivery or performing imaging.

Direct Speech Translation for Automatic Subtitling

1 code implementation 27 Sep 2022 Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e., subtitles and their corresponding timestamps.

Translation

Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora

1 code implementation 21 Sep 2022 Sara Papi, Alina Karakanta, Matteo Negri, Marco Turchi

Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant with specific display guidelines.

Translation

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

1 code implementation NAACL (AutoSimTrans) 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL).

Translation

Who Are We Talking About? Handling Person Names in Speech Translation

1 code implementation IWSLT (ACL) 2022 Marco Gaido, Matteo Negri, Marco Turchi

Recent work has shown that systems for speech translation (ST) -- similarly to automatic speech recognition (ASR) -- poorly handle person names.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

1 code implementation IWSLT (ACL) 2022 Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality.

Sentence Translation

Does Simultaneous Speech Translation need Simultaneous Models?

1 code implementation 8 Apr 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

In simultaneous speech translation (SimulST), finding the best trade-off between high translation quality and low latency is a challenging task.

Translation

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation

1 code implementation ACL 2022 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages.

POS Translation

Native state of natural proteins optimises local entropy

1 code implementation 25 Nov 2021 Matteo Negri, Guido Tiana, Riccardo Zecchina

The differing ability of polypeptide conformations to act as the native state of proteins has long been rationalized in terms of differing kinetic accessibility or thermodynamic stability.

Visualization: the missing factor in Simultaneous Speech Translation

no code implementations 31 Oct 2021 Sara Papi, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) is the task in which output generation has to be performed on partial, incremental speech input.

Translation

Speechformer: Reducing Information Loss in Direct Speech Translation

1 code implementation EMNLP 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Transformer-based models have gained increasing popularity achieving state-of-the-art performance in many research fields including speech translation.

Speech-to-Text Translation Translation

Simultaneous Speech Translation for Live Subtitling: from Delay to Display

1 code implementation MTSummit 2021 Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi

Experiments on three language pairs (en→it, de, fr) show that scrolling lines is the only mode achieving an acceptable reading speed while keeping delay close to a 4-second threshold.

Translation

Between Flexibility and Consistency: Joint Generation of Captions and Subtitles

1 code implementation ACL (IWSLT) 2021 Alina Karakanta, Marco Gaido, Matteo Negri, Marco Turchi

Speech translation (ST) has lately received growing interest for the generation of subtitles without the need for an intermediate source language transcription and timing (i.e., captions).

Translation

Dealing with training and test segmentation mismatch: FBK@IWSLT2021

no code implementations ACL (IWSLT) 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora.

Action Detection Activity Detection +4

Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

no code implementations ACL 2021 Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.

Translation

How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation

1 code implementation Findings (ACL) 2021 Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In light of this finding, we propose a combined approach that preserves BPE overall translation quality, while leveraging the higher ability of character-based segmentation to properly translate gender.

Segmentation Translation

Gender Bias in Machine Translation

1 code implementation 13 Apr 2021 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Machine translation (MT) technology has facilitated our daily tasks by providing accessible shortcuts for gathering, elaborating and communicating information.

Machine Translation Translation

Tutorial Proposal: End-to-End Speech Translation

no code implementations EACL 2021 Jan Niehues, Elizabeth Salesky, Marco Turchi, Matteo Negri

Speech translation is the translation of speech in one language typically to text in another, traditionally accomplished through a combination of automatic speech recognition and machine translation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Self-Learning for Zero Shot Neural Machine Translation

no code implementations 10 Mar 2021 Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource rich conditions.

Machine Translation NMT +2

CTC-based Compression for Direct Speech Translation

1 code implementation EACL 2021 Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST).

Translation

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations 2 Feb 2021 Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

speech-recognition Speech Recognition +1

On Knowledge Distillation for Direct Speech Translation

1 code implementation 9 Dec 2020 Marco Gaido, Mattia A. Di Gangi, Matteo Negri, Marco Turchi

Direct speech translation (ST) has been shown to be a complex task requiring knowledge transfer from its sub-tasks: automatic speech recognition (ASR) and machine translation (MT).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Breeding Gender-aware Direct Speech Translation Systems

no code implementations COLING 2020 Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g., speaker's vocal characteristics) that is otherwise lost in the cascade framework.

Machine Translation Translation

The Two Shades of Dubbing in Neural Machine Translation

no code implementations COLING 2020 Alina Karakanta, Supratik Bhattacharya, Shravan Nayak, Timo Baumann, Matteo Negri, Marco Turchi

Dubbing has two shades; synchronisation constraints are applied only when the actor's mouth is visible on screen, while the translation is unconstrained for off-screen dubbing.

Machine Translation Translation +1

Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures

no code implementations 27 Oct 2020 Carlo Baldassi, Enrico M. Malatesta, Matteo Negri, Riccardo Zecchina

We analyze the connection between minimizers with good generalizing properties and high local entropy regions of a threshold-linear classifier in Gaussian mixtures with the mean squared error loss function.

On Target Segmentation for Direct Speech Translation

no code implementations AMTA 2020 Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi

Then, subword-level segmentation became the state of the art in neural machine translation as it produces shorter sequences that reduce the training time, while being superior to word-level models.

Data Augmentation Machine Translation +2

Contextualized Translation of Automatically Segmented Speech

1 code implementation 5 Aug 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Segmentation Sentence +2

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

no code implementations WS 2020 Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

no code implementations WS 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation.

Data Augmentation Knowledge Distillation +3

Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?

no code implementations WS 2020 Alina Karakanta, Matteo Negri, Marco Turchi

Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily.

Machine Translation NMT +1

Low Resource Neural Machine Translation: A Benchmark for Five African Languages

1 code implementation 31 Mar 2020 Surafel M. Lakew, Matteo Negri, Marco Turchi

Recent advances in Neural Machine Translation (NMT) have shown improvements in low-resource language (LRL) translation tasks.

Low-Resource Neural Machine Translation NMT +2

MuST-Cinema: a Speech-to-Subtitles corpus

no code implementations LREC 2020 Alina Karakanta, Matteo Negri, Marco Turchi

Growing needs in localising audiovisual content in multiple languages through subtitles call for the development of automatic solutions for human subtitling.

Machine Translation NMT +1

Instance-Based Model Adaptation For Direct Speech Translation

no code implementations 23 Oct 2019 Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi

Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora.

Domain Adaptation Speech-to-Text Translation +1

One-To-Many Multilingual End-to-end Speech Translation

no code implementations 8 Oct 2019 Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Multilingual solutions are widely studied in MT and usually rely on "target forcing", in which multilingual parallel data are combined to train a single model by prepending to the input sequences a language token that specifies the target language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Machine Translation for Machines: the Sentiment Classification Use Case

no code implementations IJCNLP 2019 Amirhossein Tebbifakhr, Luisa Bentivogli, Matteo Negri, Marco Turchi

Towards this objective, we present a reinforcement learning technique based on a new candidate sampling strategy, which exploits the results obtained on the downstream task as weak feedback.

Classification General Classification +7

Natural representation of composite data with replicated autoencoders

no code implementations 29 Sep 2019 Matteo Negri, Davide Bergamini, Carlo Baldassi, Riccardo Zecchina, Christoph Feinauer

Generative processes in biology and other fields often produce data that can be regarded as resulting from a composition of basic features.

Multilingual Neural Machine Translation for Zero-Resource Languages

1 code implementation 16 Sep 2019 Surafel M. Lakew, Marcello Federico, Matteo Negri, Marco Turchi

In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT).

Machine Translation NMT +1

Effort-Aware Neural Automatic Post-Editing

no code implementations WS 2019 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

For this purpose, following the common approach in multilingual NMT, we prepend a special token to the beginning of both the source text and the MT output indicating the required amount of post-editing.

Automatic Post-Editing NMT +1

Neural Text Simplification in Low-Resource Conditions Using Weak Supervision

no code implementations WS 2019 Alessio Palmero Aprosio, Sara Tonelli, Marco Turchi, Matteo Negri, Mattia A. Di Gangi

Inspired by the machine translation field, in which synthetic parallel pairs generated from monolingual data yield significant improvements to neural models, in this paper we exploit large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs.

Machine Translation Sentence +3

Improving Zero-Shot Translation of Low-Resource Languages

1 code implementation IWSLT 2017 Surafel M. Lakew, Quintino F. Lotito, Matteo Negri, Marco Turchi, Marcello Federico

Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zeroshot) translation directions not observed at training time.

Machine Translation Translation

Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary

2 code implementations IWSLT (EMNLP) 2018 Surafel M. Lakew, Aliia Erofeeva, Matteo Negri, Marcello Federico, Marco Turchi

Our approach allows extending an initial model for a given language pair to cover new languages by adapting its vocabulary as long as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model).

Machine Translation NMT +2

Findings of the WMT 2018 Shared Task on Automatic Post-Editing

no code implementations WS 2018 Rajen Chatterjee, Matteo Negri, Raphael Rubino, Marco Turchi

In the former subtask, characterized by original translations of lower quality, top results achieved impressive improvements, up to -6.24 TER and +9.53 BLEU points over the baseline "do-nothing" system.

Automatic Post-Editing NMT +1

Multi-source transformer with combined losses for automatic post editing

no code implementations WS 2018 Amirhossein Tebbifakhr, Ruchit Agrawal, Matteo Negri, Marco Turchi

In the first subtask, our system improves over the baseline up to -5.3 TER and +8.23 BLEU points, ranking second out of 11 submitted runs.

Automatic Post-Editing NMT +2

eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing

no code implementations LREC 2018 Matteo Negri, Marco Turchi, Rajen Chatterjee, Nicola Bertoldi

eSCAPE consists of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly-available parallel corpora, and using the target side as an artificial human post-edit.

Automatic Post-Editing Sentence

Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English

no code implementations 31 Jul 2017 Duygu Ataman, Matteo Negri, Marco Turchi, Marcello Federico

In this paper, we propose a new vocabulary reduction method for NMT, which can reduce the vocabulary of a given input corpus at any rate while also considering the morphological properties of the language.

Machine Translation Morphological Analysis +2

Automatic Quality Estimation for ASR System Combination

no code implementations 22 Jun 2017 Shahab Jalalvand, Matteo Negri, Daniele Falavigna, Marco Matassoni, Marco Turchi

In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Online Automatic Post-editing for MT in a Multi-Domain Translation Environment

no code implementations EACL 2017 Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, Marco Turchi

Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples.

Automatic Post-Editing Translation

Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario

no code implementations EACL 2017 M. Amin Farajian, Marco Turchi, Matteo Negri, Nicola Bertoldi, Marcello Federico

State-of-the-art neural machine translation (NMT) systems are generally trained on specific domains by carefully selecting the training sets and applying proper domain adaptation techniques.

Domain Adaptation Machine Translation +2

DNN adaptation by automatic quality estimation of ASR hypotheses

no code implementations 6 Feb 2017 Daniele Falavigna, Marco Matassoni, Shahab Jalalvand, Matteo Negri, Marco Turchi

Our hypothesis is that significant improvements can be achieved by: i) automatically transcribing the evaluation data we are currently trying to recognise, and ii) selecting from it a subset of "good quality" instances based on the word error rate (WER) scores predicted by a QE component.

Sentence

Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements

no code implementations LREC 2014 Marco Turchi, Matteo Negri

To overcome these issues, we present an automatic method for the annotation of (source, target) pairs with binary judgements that reflect an empirical, and easily interpretable notion of quality.

Machine Translation Re-Ranking +2
