Speech-to-Text Translation

49 papers with code • 10 benchmarks • 3 datasets

Translate audio signals of speech in one language into text in a foreign language, either in an end-to-end or cascade manner.

Benchmarks

Add a Result

These leaderboards are used to track progress in Speech-to-Text Translation

Dataset	Best Model	Compare
MuST-C EN->DE	Task Modulation + Multitask Learning(ASR/MT) + Data Augmentation	See all
MuST-C EN->ES	Transformer with Adapters	See all
MuST-C EN->FR	Dual-decoder Transformer	See all
libri-trans	Transformer + ASR Pretrain + SpecAug	See all
MuST-C	Transformer with Adapters	See all
FLEURS X-eng	SeamlessM4T Large	See all
FLEURS eng-X	SeamlessM4T Large	See all
CoVoST 2 X-eng	SeamlessM4T Large	See all
CoVoST 2 eng-X	SeamlessM4T Large	See all
MuST-C EN->NL	Speechformer	See all

Libraries

Use these libraries to find Speech-to-Text Translation models and implementations

formiel/fairseq

2 papers

Datasets

Subtasks

Simultaneous Speech-to-Text Translation

Most implemented papers

Most implemented Social Latest No code

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

pytorch/fairseq • • Asian Chapter of the Association for Computational Linguistics 2020

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation.

Paper
Code

CoVoST 2 and Massively Multilingual Speech-to-Text Translation

facebookresearch/covost • • 20 Jul 2020

Speech translation has recently become an increasingly popular topic of research, partly due to the development of benchmark datasets.

Paper
Code

Learning Shared Semantic Space for Speech-to-Text Translation

Glaciohound/Chimera-SLT • • Findings (ACL) 2021

By projecting audio and text features to a common semantic representation, Chimera unifies MT and ST tasks and boosts the performance on ST benchmarks, MuST-C and Augmented Librispeech, to a new state-of-the-art.

Paper
Code

Lightweight Adapter Tuning for Multilingual Speech Translation

formiel/fairseq • • ACL 2021

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP.

Paper
Code

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

mt-upc/shas • • 9 Feb 2022

Speech translation datasets provide manual segmentations of the audios, which are not available in real-world scenarios, and existing segmentation methods usually significantly reduce translation quality at inference time.

Paper
Code

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

PaddlePaddle/PaddleSpeech • • 18 Mar 2022

Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation.

Paper
Code

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

PaddlePaddle/PaddleSpeech • • NAACL (ACL) 2022

PaddleSpeech is an open-source all-in-one speech toolkit.

Paper
Code

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

facebookresearch/seamless_communication • • 22 Aug 2023

What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages?

Paper
Code

Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation

eske/seq2seq • • 6 Dec 2016

This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language transcription during learning or decoding.

Paper
Code

Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation

alicank/Translation-Augmented-LibriSpeech-Corpus • LREC 2018

However, while large quantities of parallel texts (such as Europarl, OpenSubtitles) are available for training machine translation systems, there are no large (100h) and open source parallel corpora that include speech in a source language aligned to text in a target language.

Paper
Code

Speech-to-Text Translation

Benchmarks Add a Result

Libraries

Datasets

Subtasks

Most implemented papers

Content

Benchmarks

Add a Result