SpeechMatrix is a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings. It contains speech alignments in 136 language pairs with a total of 418 thousand hours of speech.
Source: SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations