fairseq S2T: Fast Speech-to-Text Modeling with fairseq

11 Oct 2020. Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation. It follows fairseq's careful design for scalability and extensibility...

Task: Speech-to-Text Translation
Dataset: MuST-C EN->DE
Model: Transformer + ASR Pretrain
Metric name: Case-sensitive sacreBLEU
Metric value: 22.9
Global rank: #2
