Search Results for author: Mattia Antonino Di Gangi

Found 15 papers, 4 papers with code

Monolingual Embeddings for Low Resourced Neural Machine Translation

1 code implementation IWSLT 2017 Mattia Antonino Di Gangi, Marcello Federico

When little parallel data exists for a language pair, the model cannot learn good representations for words, particularly for rare words.

Machine Translation NMT +2

On Target Segmentation for Direct Speech Translation

no code implementations AMTA 2020 Mattia Antonino Di Gangi, Marco Gaido, Matteo Negri, Marco Turchi

Then, subword-level segmentation became the state of the art in neural machine translation as it produces shorter sequences that reduce the training time, while being superior to word-level models.

Data Augmentation Machine Translation +2
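The subword-level segmentation mentioned in the abstract above typically refers to byte-pair encoding (BPE), which repeatedly merges the most frequent adjacent symbol pair in the training corpus. The following is a minimal illustrative sketch of that merge loop (not code from the paper; function names and the toy corpus are invented for illustration):

```python
from collections import Counter

def pair_counts(words):
    """Count adjacent symbol pairs across a dict of symbol-tuple -> frequency."""
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` BPE merge operations from a word-frequency dict."""
    words = {tuple(w): f for w, f in corpus.items()}
    merges = []
    for _ in range(num_merges):
        counts = pair_counts(words)
        if not counts:
            break
        best = max(counts, key=counts.get)  # most frequent adjacent pair
        words = merge_pair(words, best)
        merges.append(best)
    return merges, words
```

On a toy corpus such as `{"low": 5, "lower": 2}`, the first learned merges combine the frequent character pairs, yielding shorter sequences than character-level input while keeping an open vocabulary, which is the property the abstract credits for reduced training time.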

Contextualized Translation of Automatically Segmented Speech

1 code implementation 5 Aug 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Mauro Cettolo, Marco Turchi

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Segmentation Sentence +2

End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

no code implementations WS 2020 Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

The test talks are provided in two versions: one contains the data already segmented with automatic tools and the other is the raw data without any segmentation.

Data Augmentation Knowledge Distillation +3

Instance-Based Model Adaptation For Direct Speech Translation

no code implementations 23 Oct 2019 Mattia Antonino Di Gangi, Viet-Nhat Nguyen, Matteo Negri, Marco Turchi

Despite recent technology advancements, the effectiveness of neural approaches to end-to-end speech-to-text translation is still limited by the paucity of publicly available training corpora.

Domain Adaptation Speech-to-Text Translation +1

Robust Neural Machine Translation for Clean and Noisy Speech Transcripts

no code implementations EMNLP (IWSLT) 2019 Mattia Antonino Di Gangi, Robert Enyedi, Alessandra Brusadin, Marcello Federico

Our experimental results on a public speech translation data set show that adapting a model on a significant amount of parallel data including ASR transcripts is beneficial with test data of the same type, but produces a small degradation when translating clean text.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

One-To-Many Multilingual End-to-end Speech Translation

no code implementations 8 Oct 2019 Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Multilingual solutions are widely studied in MT and usually rely on "target forcing", in which multilingual parallel data are combined to train a single model by prepending to the input sequences a language token that specifies the target language.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
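The target-forcing scheme described above amounts to a simple preprocessing step: each source sequence in the combined multilingual training data is prefixed with a token naming the desired target language. A minimal sketch of that step (not code from the paper; the `<2xx>` token format and function name are illustrative assumptions):

```python
def build_multilingual_batch(examples):
    """Combine parallel data from several language pairs into one training set.

    examples: list of (source_tokens, target_tokens, target_lang) triples.
    Returns (source, target) pairs where each source is prefixed with a
    language token (e.g. '<2de>') that tells the single shared model which
    target language to produce.
    """
    batch = []
    for src, tgt, lang in examples:
        forced_src = [f"<2{lang}>"] + list(src)  # prepend the language token
        batch.append((forced_src, list(tgt)))
    return batch
```

At inference time the same token steers the one-to-many model toward the requested output language, so no per-language decoder is needed.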

Effectiveness of Data-Driven Induction of Semantic Spaces and Traditional Classifiers for Sarcasm Detection

1 code implementation 2 Apr 2019 Mattia Antonino Di Gangi, Giosué Lo Bosco, Giovanni Pilato

Irony and sarcasm are two complex linguistic phenomena that are widely used in everyday language, especially on social media, but they represent two serious issues for automated text understanding.

Sarcasm Detection

Deep Neural Machine Translation with Weakly-Recurrent Units

1 code implementation 10 May 2018 Mattia Antonino Di Gangi, Marcello Federico

Recurrent neural networks (RNNs) have represented for years the state of the art in neural machine translation.

Machine Translation NMT +2
