Search Results for author: Sara Papi

Found 26 papers, 18 papers with code

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

1 code implementation 1 Oct 2024 Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, Matteo Negri

The rise of foundation models (FMs), coupled with regulatory efforts addressing their risks and impacts, has sparked significant interest in open-source models.

SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation

1 code implementation 20 Jun 2024 Sara Papi, Marco Gaido, Matteo Negri, Luisa Bentivogli

This paper describes FBK's participation in the Simultaneous Translation Evaluation Campaign at IWSLT 2024.

Speech-to-Text Translation Translation

StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection

1 code implementation 10 Jun 2024 Sara Papi, Marco Gaido, Matteo Negri, Luisa Bentivogli

To fill this gap, we introduce StreamAtt, the first StreamST policy, and propose StreamLAAL, the first StreamST latency metric designed to be comparable with existing metrics for SimulST.

Speech-to-Text Translation Translation

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

2 code implementations 17 May 2024 Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli

Subtitling plays a crucial role in enhancing the accessibility of audiovisual content and encompasses three primary subtasks: translating spoken dialogue, segmenting translations into concise textual units, and estimating timestamps that govern their on-screen duration.

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

1 code implementation 20 Feb 2024 Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The attention mechanism, a cornerstone of state-of-the-art neural models, faces computational hurdles in processing long sequences due to its quadratic complexity.

Automatic Speech Recognition Image Classification +3

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

no code implementations 19 Feb 2024 Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

The field of natural language processing (NLP) has recently witnessed a transformative shift with the emergence of foundation models, particularly Large Language Models (LLMs) that have revolutionized text-based NLP.

Speech-to-Text Translation

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

1 code implementation 24 Oct 2023 Dennis Fucci, Marco Gaido, Sara Papi, Mauro Cettolo, Matteo Negri, Luisa Bentivogli

When translating words referring to the speaker, speech translation (ST) systems should not resort to default masculine generics nor rely on potentially misleading vocal traits.

Decoder Language Modelling

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

no code implementations 23 Oct 2023 Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li, Yashesh Gaur

The growing need for instant spoken language transcription and translation is driven by increased global communication and cross-lingual interactions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Direct Models for Simultaneous Translation and Automatic Subtitling: FBK@IWSLT2023

1 code implementation 27 Sep 2023 Sara Papi, Marco Gaido, Matteo Negri

This paper describes FBK's participation in the Simultaneous Translation and Automatic Subtitling tracks of the IWSLT 2023 Evaluation Campaign.

Translation

Token-Level Serialized Output Training for Joint Streaming ASR and ST Leveraging Textual Alignments

no code implementations 7 Jul 2023 Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Jinyu Li, Yashesh Gaur

In real-world applications, users often require both translations and transcriptions of speech to enhance their comprehension, particularly in streaming scenarios where incremental generation is necessary.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation

2 code implementations 19 May 2023 Sara Papi, Marco Turchi, Matteo Negri

Attention is the core mechanism of today's most used architectures for natural language processing and has been analyzed from many perspectives, including its effectiveness for machine translation-related tasks.

Machine Translation Translation +1

When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP

2 code implementations 28 Mar 2023 Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri

Despite its crucial role in research experiments, code correctness is often presumed only on the basis of the perceived quality of results.

Automatic Speech Recognition speech-recognition +1

Attention as a Guide for Simultaneous Speech Translation

2 code implementations 15 Dec 2022 Sara Papi, Matteo Negri, Marco Turchi

The study of the attention mechanism has sparked interest in many fields, such as language modeling and machine translation.

Decoder Language Modelling +2

Joint Speech Translation and Named Entity Recognition

1 code implementation 21 Oct 2022 Marco Gaido, Sara Papi, Matteo Negri, Marco Turchi

Modern automatic translation systems aim to place the human at the center by providing contextual support and knowledge.

Computational Efficiency Entity Linking +4

Direct Speech Translation for Automatic Subtitling

1 code implementation 27 Sep 2022 Sara Papi, Marco Gaido, Alina Karakanta, Mauro Cettolo, Matteo Negri, Marco Turchi

Automatic subtitling is the task of automatically translating the speech of audiovisual content into short pieces of timed text, i.e., subtitles and their corresponding timestamps.

Translation

Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora

1 code implementation 21 Sep 2022 Sara Papi, Alina Karakanta, Matteo Negri, Marco Turchi

Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant with specific displaying guidelines.

Translation

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

1 code implementation NAACL (AutoSimTrans) 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency, which is normally computed in terms of Average Lagging (AL).

Translation

Efficient yet Competitive Speech Translation: FBK@IWSLT2022

1 code implementation IWSLT (ACL) 2022 Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi

The primary goal of FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality.

Sentence Translation

Does Simultaneous Speech Translation need Simultaneous Models?

1 code implementation 8 Apr 2022 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

In simultaneous speech translation (SimulST), finding the best trade-off between high translation quality and low latency is a challenging task.

Translation

Visualization: the missing factor in Simultaneous Speech Translation

no code implementations 31 Oct 2021 Sara Papi, Matteo Negri, Marco Turchi

Simultaneous speech translation (SimulST) is the task in which output generation has to be performed on partial, incremental speech input.

Translation

Speechformer: Reducing Information Loss in Direct Speech Translation

1 code implementation EMNLP 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Transformer-based models have gained increasing popularity, achieving state-of-the-art performance in many research fields, including speech translation.

Speech-to-Text Translation Translation

Simultaneous Speech Translation for Live Subtitling: from Delay to Display

1 code implementation MTSummit 2021 Alina Karakanta, Sara Papi, Matteo Negri, Marco Turchi

Experiments on three language pairs (en$\rightarrow$it, de, fr) show that scrolling lines is the only mode achieving an acceptable reading speed while keeping delay close to a 4-second threshold.

Translation

Mixtures of Deep Neural Experts for Automated Speech Scoring

no code implementations 23 Jun 2021 Sara Papi, Edmondo Trentin, Roberto Gretter, Marco Matassoni, Daniele Falavigna

The paper addresses the task of automatic assessment of second language proficiency from the language learners' spoken responses to test prompts.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Dealing with training and test segmentation mismatch: FBK@IWSLT2021

no code implementations ACL (IWSLT) 2021 Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora.

Action Detection Activity Detection +4
