Speech-to-Text Translation

50 papers with code • 10 benchmarks • 3 datasets

Speech-to-text translation converts speech audio in one language into text in another language, using either a single end-to-end model or a cascade of speech recognition followed by machine translation.
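
The two setups differ mainly in whether an intermediate transcript exists. A minimal sketch, assuming the models are plain callables; the names `cascade_translate`, `end_to_end_translate`, and the toy stand-ins are hypothetical and not taken from any of the listed repositories:

```python
from typing import Callable, List

Audio = List[float]  # placeholder for a waveform; real systems use feature tensors


def cascade_translate(audio: Audio,
                      asr: Callable[[Audio], str],
                      mt: Callable[[str], str]) -> str:
    """Cascade: recognize source-language text first, then translate it."""
    transcript = asr(audio)
    return mt(transcript)


def end_to_end_translate(audio: Audio, st: Callable[[Audio], str]) -> str:
    """End-to-end: a single model maps audio straight to target-language text."""
    return st(audio)


# Toy stand-ins (hypothetical, for illustration only).
def toy_asr(audio: Audio) -> str:
    return "hello"


def toy_mt(text: str) -> str:
    return {"hello": "bonjour"}.get(text, text)


def toy_st(audio: Audio) -> str:
    return "bonjour"


print(cascade_translate([0.0, 0.1], toy_asr, toy_mt))
print(end_to_end_translate([0.0, 0.1], toy_st))
```

In the cascade, recognition errors propagate into the translation step; the end-to-end path avoids the intermediate transcript but needs paired speech and translation data.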

Most implemented papers

End-to-End Automatic Speech Translation of Audiobooks

alicank/Translation-Augmented-LibriSpeech-Corpus 12 Feb 2018

We investigate end-to-end speech-to-text translation on a corpus of audiobooks specifically augmented for this task.

Pre-training on high-resource speech recognition improves low-resource speech-to-text translation

0xSameer/ast NAACL 2019

Finally, we show that the approach improves performance on a true low-resource task: pre-training on a combination of English ASR and French ASR improves Mboshi-French ST, where only 4 hours of data are available, from 3.5 to 7.1 BLEU.

Direct speech-to-speech translation with a sequence-to-sequence model

sam2125/translatotron 12 Apr 2019

We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding

kabongosalomon/listra 16 Dec 2019

Speech-to-text translation (ST), which translates source language speech into target language text, has attracted intensive attention in recent years.

FlexiBO: A Decoupled Cost-Aware Multi-Objective Optimization Approach for Deep Neural Networks

softsys4ai/FlexiBO 18 Jan 2020

FlexiBO weights the improvement of the hypervolume of the Pareto region by the measurement cost of each objective to balance the expense of collecting new information with the knowledge gained through objective evaluations, preventing us from performing expensive measurements for little to no gain.
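
The idea of weighting hypervolume improvement by measurement cost can be illustrated with a simplified sketch. This is not the FlexiBO implementation: it assumes two objectives that are both minimized, a known Pareto front of already-measured points, and a single scalar cost per candidate. The function names and the reference point are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y < best_y:  # point is non-dominated in the sweep so far
            hv += (ref[0] - x) * (best_y - y)
            best_y = y
    return hv


def cost_weighted_gain(front: List[Point], candidate: Point,
                       ref: Point, cost: float) -> float:
    """Hypervolume improvement from adding `candidate`, per unit of measurement cost."""
    gain = hypervolume_2d(front + [candidate], ref) - hypervolume_2d(front, ref)
    return gain / cost


front = [(1.0, 3.0), (3.0, 1.0)]
ref = (4.0, 4.0)
# The same candidate looks less attractive when measuring it is expensive.
print(cost_weighted_gain(front, (2.0, 2.0), ref, cost=0.5))
print(cost_weighted_gain(front, (2.0, 2.0), ref, cost=2.0))
```

A candidate with a small gain and a high cost scores low, which matches the stated goal of avoiding expensive measurements that add little information.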

CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus

facebookresearch/covost LREC 2020

Spoken language translation has recently witnessed a resurgence in popularity, thanks to the development of end-to-end models and the creation of new corpora, such as Augmented LibriSpeech and MuST-C.

Contextualized Translation of Automatically Segmented Speech

mgaido91/FBK-fairseq-ST 5 Aug 2020

We show that our context-aware solution is more robust to VAD-segmented input, outperforming a strong base model and the fine-tuning on different VAD segmentations of an English-German test set by up to 4.25 BLEU points.

Consecutive Decoding for Speech-to-text Translation

dqqcasia/st 21 Sep 2020

The key idea is to generate source transcript and target translation text with a single decoder.
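
One way to picture a single decoder producing both outputs is as one target sequence that holds the transcript followed by the translation. A minimal sketch under that assumption; the separator token `<sep>` and the helper names are hypothetical and may differ from what the paper's code uses:

```python
from typing import List, Tuple

SEP = "<sep>"  # hypothetical boundary token between transcript and translation


def build_target(transcript: List[str], translation: List[str]) -> List[str]:
    """Concatenate transcript and translation into one decoder target sequence."""
    return transcript + [SEP] + translation


def split_output(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Recover the transcript and the translation from a decoded sequence."""
    i = tokens.index(SEP)  # raises ValueError if the decoder never emitted SEP
    return tokens[:i], tokens[i + 1:]


target = build_target(["good", "morning"], ["guten", "Morgen"])
print(target)
print(split_output(target))
```

Because the translation tokens come after the transcript in the same sequence, the decoder can condition on the transcript it has just produced.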

"Listen, Understand and Translate": Triple Supervision Decouples End-to-end Speech-to-text Translation

dqqcasia/st 21 Sep 2020

Can we build a system to fully utilize signals in a parallel ST corpus?

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

formiel/speech-translation COLING 2020

We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.