Search Results for author: Marco Turchi

Found 121 papers, 27 papers with code

Automatic Translation for Multiple NLP tasks: a Multi-task Approach to Machine-oriented NMT Adaptation

no code implementations EAMT 2020 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

We address this problem by proposing a multi-task approach to machine-oriented NMT adaptation, which is capable of serving multiple downstream tasks with a single system.

Machine Translation NMT +1

Machine-oriented NMT Adaptation for Zero-shot NLP tasks: Comparing the Usefulness of Close and Distant Languages

no code implementations VarDial (COLING) 2020 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

In this work, we tackle the problem in a multilingual setting where a single NMT model translates from multiple languages for downstream automatic processing in the target language.

Machine Translation NMT

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

CEF Data Marketplace: Powering a Long-term Supply of Language Data

no code implementations EAMT 2020 Amir Kamran, Dace Dzeguze, Jaap van der Meer, Milica Panic, Alessandro Cattelan, Daniele Patrioli, Luisa Bentivogli, Marco Turchi

We describe the CEF Data Marketplace project, which focuses on the development of a trading platform of translation data for language professionals: translators, machine translation (MT) developers, language service providers (LSPs), translation buyers and government bodies.

Machine Translation Translation

FBK’s Multilingual Neural Machine Translation System for IWSLT 2017

no code implementations IWSLT 2017 Surafel M. Lakew, Quintino F. Lotito, Marco Turchi, Matteo Negri, Marcello Federico

Particularly, we focus on the four zero-shot directions and show how a multilingual model trained with small data can provide reasonable results.

Machine Translation Transfer Learning +1

Instance Selection for Online Automatic Post-Editing in a multi-domain scenario

no code implementations AMTA 2016 Rajen Chatterjee, Mihael Arcan, Matteo Negri, Marco Turchi

In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback in the translation workflow.

Automatic Post-Editing Information Retrieval +2

Findings of the WMT 2020 Shared Task on Automatic Post-Editing

no code implementations WMT (EMNLP) 2020 Rajen Chatterjee, Markus Freitag, Matteo Negri, Marco Turchi

Due to i) the different source/domain of data compared to the past (Wikipedia vs Information Technology), ii) the different quality of the initial translations to be corrected and iii) the introduction of a new language pair (English-Chinese), this year’s results are not directly comparable with last year’s round.

Automatic Post-Editing NMT

Towards a methodology for evaluating automatic subtitling

no code implementations EAMT 2022 Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

In response to the increasing interest towards automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles.

Segmentation

Post-editing in Automatic Subtitling: A Subtitlers’ perspective

1 code implementation EAMT 2022 Alina Karakanta, Luisa Bentivogli, Mauro Cettolo, Matteo Negri, Marco Turchi

Subtitling tools are recently being adapted for post-editing by providing automatically generated subtitles, and featuring not only machine translation, but also automatic segmentation and synchronisation.

Machine Translation Translation

The IWSLT 2018 Evaluation Campaign

no code implementations IWSLT (EMNLP) 2018 Jan Niehues, Roldano Cattoni, Sebastian Stüker, Mauro Cettolo, Marco Turchi, Marcello Federico

The International Workshop on Spoken Language Translation (IWSLT) 2018 Evaluation Campaign featured two tasks: low-resource machine translation and speech translation.

Machine Translation Translation

Zero-Shot Neural Machine Translation with Self-Learning Cycle

no code implementations MTSummit 2021 Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions.

Machine Translation NMT +2

The IWSLT 2019 Evaluation Campaign

no code implementations EMNLP (IWSLT) 2019 Jan Niehues, Roldano Cattoni, Sebastian Stüker, Matteo Negri, Marco Turchi, Thanh-Le Ha, Elizabeth Salesky, Ramon Sanabria, Loic Barrault, Lucia Specia, Marcello Federico

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech.

Translation

Findings of the 2021 Conference on Machine Translation (WMT21)

no code implementations WMT (EMNLP) 2021 Farhad Akhbardeh, Arkady Arkhangorodsky, Magdalena Biesialska, Ondřej Bojar, Rajen Chatterjee, Vishrav Chaudhary, Marta R. Costa-Jussa, Cristina España-Bonet, Angela Fan, Christian Federmann, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Leonie Harter, Kenneth Heafield, Christopher Homan, Matthias Huck, Kwabena Amponsah-Kaakyire, Jungo Kasai, Daniel Khashabi, Kevin Knight, Tom Kocmi, Philipp Koehn, Nicholas Lourie, Christof Monz, Makoto Morishita, Masaaki Nagata, Ajay Nagesh, Toshiaki Nakazawa, Matteo Negri, Santanu Pal, Allahsera Auguste Tapo, Marco Turchi, Valentin Vydrin, Marcos Zampieri

This paper presents the results of the news translation task, the multilingual low-resource translation for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories.

Machine Translation Translation

On the Dynamics of Gender Learning in Speech Translation

no code implementations NAACL (GeBNLP) 2022 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

In this work, we contribute to such a line of inquiry by exploring the emergence of gender bias in Speech Translation (ST).

Translation

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation

1 code implementation 19 May 2023 Sara Papi, Marco Turchi, Matteo Negri

Attention is the core mechanism of today's most used architectures for natural language processing and has been analyzed from many perspectives, including its effectiveness for machine translation-related tasks.

Machine Translation Translation +1

Attention as a Guide for Simultaneous Speech Translation

1 code implementation 15 Dec 2022 Sara Papi, Matteo Negri, Marco Turchi

The study of the attention mechanism has sparked interest in many fields, such as language modeling and machine translation.

Language Modelling Machine Translation +1

Joint Speech Translation and Named Entity Recognition

1 code implementation 21 Oct 2022 Marco Gaido, Sara Papi, Matteo Negri, Marco Turchi

Modern automatic translation systems aim at placing the human at the center by providing contextual support and knowledge.

Computational Efficiency Entity Linking +4

Direct Speech Translation for Automatic Subtitling

1 code implementation 27 Sep 2022 Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e., subtitles and their corresponding timestamps.

Translation

Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora

1 code implementation 21 Sep 2022 Sara Papi, Alina Karakanta, Matteo Negri, Marco Turchi

Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant to specific displaying guidelines.

Translation

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

1 code implementation NAACL (AutoSimTrans) 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL).

Translation

Who Are We Talking About? Handling Person Names in Speech Translation

1 code implementation IWSLT (ACL) 2022 Marco Gaido, Matteo Negri, Marco Turchi

Recent work has shown that systems for speech translation (ST) -- similarly to automatic speech recognition (ASR) -- poorly handle person names.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

1 code implementation IWSLT (ACL) 2022 Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality.

Sentence Translation

Does Simultaneous Speech Translation need Simultaneous Models?

1 code implementation 8 Apr 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

In simultaneous speech translation (SimulST), finding the best trade-off between high translation quality and low latency is a challenging task.

Translation

Under the Morphosyntactic Lens: A Multifaceted Evaluation of Gender Bias in Speech Translation

1 code implementation ACL 2022 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Gender bias is largely recognized as a problematic phenomenon affecting language technologies, with recent studies underscoring that it might surface differently across languages.

POS Translation

Visualization: the missing factor in Simultaneous Speech Translation

no code implementations 31 Oct 2021 Sara Papi, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) is the task in which output generation has to be performed on partial, incremental speech input.

Translation

Speechformer: Reducing Information Loss in Direct Speech Translation

1 code implementation EMNLP 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Transformer-based models have gained increasing popularity achieving state-of-the-art performance in many research fields including speech translation.

Speech-to-Text Translation Translation

Simultaneous Speech Translation for Live Subtitling: from Delay to Display

1 code implementation MTSummit 2021 Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi

Experiments on three language pairs (en→it, de, fr) show that scrolling lines is the only mode achieving an acceptable reading speed while keeping delay close to a 4-second threshold.

Translation

Between Flexibility and Consistency: Joint Generation of Captions and Subtitles

1 code implementation ACL (IWSLT) 2021 Alina Karakanta, Marco Gaido, Matteo Negri, Marco Turchi

Speech translation (ST) has lately received growing interest for the generation of subtitles without the need for an intermediate source language transcription and timing (i.e., captions).

Translation

Dealing with training and test segmentation mismatch: FBK@IWSLT2021

no code implementations ACL (IWSLT) 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora.

Action Detection Activity Detection +4

Cascade versus Direct Speech Translation: Do the Differences Still Make a Difference?

no code implementations ACL 2021 Luisa Bentivogli, Mauro Cettolo, Marco Gaido, Alina Karakanta, Alberto Martinelli, Matteo Negri, Marco Turchi

Five years after the first published proofs of concept, direct approaches to speech translation (ST) are now competing with traditional cascade solutions.

Translation

How to Split: the Effect of Word Segmentation on Gender Bias in Speech Translation

1 code implementation Findings (ACL) 2021 Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In light of this finding, we propose a combined approach that preserves BPE overall translation quality, while leveraging the higher ability of character-based segmentation to properly translate gender.

Segmentation Translation

Gender Bias in Machine Translation

1 code implementation 13 Apr 2021 Beatrice Savoldi, Marco Gaido, Luisa Bentivogli, Matteo Negri, Marco Turchi

Machine translation (MT) technology has facilitated our daily tasks by providing accessible shortcuts for gathering, elaborating and communicating information.

Machine Translation Translation

Tutorial Proposal: End-to-End Speech Translation

no code implementations EACL 2021 Jan Niehues, Elizabeth Salesky, Marco Turchi, Matteo Negri

Speech translation is the translation of speech in one language typically to text in another, traditionally accomplished through a combination of automatic speech recognition and machine translation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Self-Learning for Zero Shot Neural Machine Translation

no code implementations 10 Mar 2021 Surafel M. Lakew, Matteo Negri, Marco Turchi

Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource rich conditions.

Machine Translation NMT +2

The Multilingual TEDx Corpus for Speech Recognition and Translation

no code implementations 2 Feb 2021 Elizabeth Salesky, Matthew Wiesner, Jacob Bremerman, Roldano Cattoni, Matteo Negri, Marco Turchi, Douglas W. Oard, Matt Post

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.

speech-recognition Speech Recognition +1

CTC-based Compression for Direct Speech Translation

1 code implementation EACL 2021 Marco Gaido, Mauro Cettolo, Matteo Negri, Marco Turchi

Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST).

Translation

Breeding Gender-aware Direct Speech Translation Systems

no code implementations COLING 2020 Marco Gaido, Beatrice Savoldi, Luisa Bentivogli, Matteo Negri, Marco Turchi

In particular, by translating speech audio data without intermediate transcription, direct ST models are able to leverage and preserve essential information present in the input (e.g., speaker's vocal characteristics) that is otherwise lost in the cascade framework.

Machine Translation Translation

On Knowledge Distillation for Direct Speech Translation

1 code implementation 9 Dec 2020 Marco Gaido, Mattia A. Di Gangi, Matteo Negri, Marco Turchi

Direct speech translation (ST) has been shown to be a complex task requiring knowledge transfer from its sub-tasks: automatic speech recognition (ASR) and machine translation (MT).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

The Two Shades of Dubbing in Neural Machine Translation

no code implementations COLING 2020 Alina Karakanta, Supratik Bhattacharya, Shravan Nayak, Timo Baumann, Matteo Negri, Marco Turchi

Dubbing has two shades; synchronisation constraints are applied only when the actor's mouth is visible on screen, while the translation is unconstrained for off-screen dubbing.

Machine Translation Translation +1

On Target Segmentation for Direct Speech Translation

no code implementations AMTA 2020 Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi

Then, subword-level segmentation became the state of the art in neural machine translation as it produces shorter sequences that reduce the training time, while being superior to word-level models.

Data Augmentation Machine Translation +2

Contextualized Translation of Automatically Segmented Speech

1 code implementation 5 Aug 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Segmentation Sentence +2

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

no code implementations WS 2020 Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

no code implementations WS 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation.

Data Augmentation Knowledge Distillation +3

Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?

no code implementations WS 2020 Alina Karakanta, Matteo Negri, Marco Turchi

Subtitling is becoming increasingly important for disseminating information, given the enormous amounts of audiovisual content becoming available daily.

Machine Translation NMT +1

Low Resource Neural Machine Translation: A Benchmark for Five African Languages

1 code implementation 31 Mar 2020 Surafel M. Lakew, Matteo Negri, Marco Turchi

Recent advents in Neural Machine Translation (NMT) have shown improvements in low-resource language (LRL) translation tasks.

Low-Resource Neural Machine Translation NMT +2

MuST-Cinema: a Speech-to-Subtitles corpus

no code implementations LREC 2020 Alina Karakanta, Matteo Negri, Marco Turchi

Growing needs in localising audiovisual content in multiple languages through subtitles call for the development of automatic solutions for human subtitling.

Machine Translation NMT +1

Instance-Based Model Adaptation For Direct Speech Translation

no code implementations 23 Oct 2019 Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi

Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora.

Domain Adaptation Speech-to-Text Translation +1

One-To-Many Multilingual End-to-end Speech Translation

no code implementations 8 Oct 2019 Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Multilingual solutions are widely studied in MT and usually rely on "target forcing", in which multilingual parallel data are combined to train a single model by prepending to the input sequences a language token that specifies the target language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Machine Translation for Machines: the Sentiment Classification Use Case

no code implementations IJCNLP 2019 Amirhossein Tebbifakhr, Luisa Bentivogli, Matteo Negri, Marco Turchi

Towards this objective, we present a reinforcement learning technique based on a new candidate sampling strategy, which exploits the results obtained on the downstream task as weak feedback.

Classification General Classification +7

Multilingual Neural Machine Translation for Zero-Resource Languages

1 code implementation 16 Sep 2019 Surafel M. Lakew, Marcello Federico, Matteo Negri, Marco Turchi

In recent years, Neural Machine Translation (NMT) has been shown to be more effective than phrase-based statistical methods, thus quickly becoming the state of the art in machine translation (MT).

Machine Translation NMT +1

Effort-Aware Neural Automatic Post-Editing

no code implementations WS 2019 Amirhossein Tebbifakhr, Matteo Negri, Marco Turchi

For this purpose, following the common approach in multilingual NMT, we prepend a special token to the beginning of both the source text and the MT output indicating the required amount of post-editing.

Automatic Post-Editing NMT +1

Neural Text Simplification in Low-Resource Conditions Using Weak Supervision

no code implementations WS 2019 Alessio Palmero Aprosio, Sara Tonelli, Marco Turchi, Matteo Negri, Mattia A. Di Gangi

Inspired by the machine translation field, in which synthetic parallel pairs generated from monolingual data yield significant improvements to neural models, in this paper we exploit large amounts of heterogeneous data to automatically select simple sentences, which are then used to create synthetic simplification pairs.

Machine Translation Sentence +3

Improving Zero-Shot Translation of Low-Resource Languages

1 code implementation IWSLT 2017 Surafel M. Lakew, Quintino F. Lotito, Matteo Negri, Marco Turchi, Marcello Federico

Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zeroshot) translation directions not observed at training time.

Machine Translation Translation

Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary

2 code implementations IWSLT (EMNLP) 2018 Surafel M. Lakew, Aliia Erofeeva, Matteo Negri, Marcello Federico, Marco Turchi

Our approach allows to extend an initial model for a given language pair to cover new languages by adapting its vocabulary as long as new data become available (i.e., introducing new vocabulary items if they are not included in the initial model).

Machine Translation NMT +2

Findings of the WMT 2018 Shared Task on Automatic Post-Editing

no code implementations WS 2018 Rajen Chatterjee, Matteo Negri, Raphael Rubino, Marco Turchi

In the former subtask, characterized by original translations of lower quality, top results achieved impressive improvements, up to -6.24 TER and +9.53 BLEU points over the baseline "do-nothing" system.

Automatic Post-Editing NMT +1

Multi-source transformer with combined losses for automatic post editing

no code implementations WS 2018 Amirhossein Tebbifakhr, Ruchit Agrawal, Matteo Negri, Marco Turchi

In the first subtask, our system improves over the baseline up to -5.3 TER and +8.23 BLEU points, ranking second out of 11 submitted runs.

Automatic Post-Editing NMT +2

eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing

no code implementations LREC 2018 Matteo Negri, Marco Turchi, Rajen Chatterjee, Nicola Bertoldi

eSCAPE consists of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly-available parallel corpora, and using the target side as an artificial human post-edit.

Automatic Post-Editing Sentence

Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English

no code implementations 31 Jul 2017 Duygu Ataman, Matteo Negri, Marco Turchi, Marcello Federico

In this paper, we propose a new vocabulary reduction method for NMT, which can reduce the vocabulary of a given input corpus at any rate while also considering the morphological properties of the language.

Machine Translation Morphological Analysis +2

Automatic Quality Estimation for ASR System Combination

no code implementations 22 Jun 2017 Shahab Jalalvand, Matteo Negri, Daniele Falavigna, Marco Matassoni, Marco Turchi

In this paper we propose a novel variant of ROVER that takes advantage of ASR quality estimation (QE) for ranking the transcriptions at "segment level" instead of: i) relying on confidence scores, or ii) feeding ROVER with randomly ordered hypotheses.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Online Automatic Post-editing for MT in a Multi-Domain Translation Environment

no code implementations EACL 2017 Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, Marco Turchi

Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples.

Automatic Post-Editing Translation

Neural vs. Phrase-Based Machine Translation in a Multi-Domain Scenario

no code implementations EACL 2017 M. Amin Farajian, Marco Turchi, Matteo Negri, Nicola Bertoldi, Marcello Federico

State-of-the-art neural machine translation (NMT) systems are generally trained on specific domains by carefully selecting the training sets and applying proper domain adaptation techniques.

Domain Adaptation Machine Translation +2

DNN adaptation by automatic quality estimation of ASR hypotheses

no code implementations 6 Feb 2017 Daniele Falavigna, Marco Matassoni, Shahab Jalalvand, Matteo Negri, Marco Turchi

Our hypothesis is that significant improvements can be achieved by: i) automatically transcribing the evaluation data we are currently trying to recognise, and ii) selecting from it a subset of "good quality" instances based on the word error rate (WER) scores predicted by a QE component.

Sentence

SentiWords: Deriving a High Precision and High Coverage Lexicon for Sentiment Analysis

no code implementations 30 Oct 2015 Lorenzo Gatti, Marco Guerini, Marco Turchi

Using this technique we have built SentiWords, a prior polarity lexicon of approximately 155,000 words, that has both a high precision and a high coverage.

Sentiment Analysis Vocal Bursts Intensity Prediction

An efficient and user-friendly tool for machine translation quality estimation

no code implementations LREC 2014 Kashif Shah, Marco Turchi, Lucia Specia

We present a new version of QUEST, an open source framework for machine translation quality estimation, which brings a number of improvements: (i) it provides a Web interface and functionalities such that non-expert users, e.g., translators or lay-users of machine translations, can get quality predictions (or internal features of the framework) for translations without having to install the toolkit, obtain resources or build prediction models; (ii) it significantly improves over the previous runtime performance by keeping resources (such as language models) in memory; (iii) it provides an option for users to submit the source text only and automatically obtain translations from Bing Translator; (iv) it provides a ranking of multiple translations submitted by users for each source text according to their estimated quality.

Machine Translation Translation

Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements

no code implementations LREC 2014 Marco Turchi, Matteo Negri

To overcome these issues, we present an automatic method for the annotation of (source, target) pairs with binary judgements that reflect an empirical, and easily interpretable notion of quality.

Machine Translation Re-Ranking +2

ONTS: "Optima" News Translation System

no code implementations EACL 2012 Marco Turchi, Martin Atkinson, Alastair Wilcox, Brett Crawley, Stefano Bucci, Ralf Steinberger, Erik van der Goot

We propose a real-time machine translation system that allows users to select a news category and to translate the related live news articles from Arabic, Czech, Danish, Farsi, French, German, Italian, Polish, Portuguese, Spanish and Turkish into English.

Machine Translation Translation

Sentiment Analysis: How to Derive Prior Polarities from SentiWordNet

no code implementations EMNLP 2013 Marco Guerini, Lorenzo Gatti, Marco Turchi

Assigning a positive or negative score to a word out of context (i.e., a word's prior polarity) is a challenging task for sentiment analysis.

General Classification Sentiment Analysis

JRC EuroVoc Indexer JEX - A freely available multi-label categorisation tool

no code implementations LREC 2012 Ralf Steinberger, Mohamed Ebrahim, Marco Turchi

EuroVoc (2012) is a highly multilingual thesaurus consisting of over 6,700 hierarchically organised subject domains used by European Institutions and many authorities in Member States of the European Union (EU) for the classification and retrieval of official documents.

Classification Clustering +4
