SpeechMatrix

Introduced by Duquenne et al. in SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

SpeechMatrix is a large-scale multilingual corpus of speech-to-speech translations mined from real speech in European Parliament recordings. It contains speech alignments for 136 language pairs, totaling 418 thousand hours of speech.

Source: SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

Similar Datasets

CSS10

Source: https://scontent-lhr8-2.xx.fbcdn.net/v/t39.8562-6/310002966_605149234737289_5204270723809834290_n.pdf?_nc_cat=102&ccb=1-7&_nc_sid=ad8a9d&_nc_ohc=FN2KnupyKI0AX90B5UO&_nc_ht=scontent-lhr8-2.xx&oh=00_AT9iFWHchGOnkzVTmwiYIDElIXSnwilSGhDwRQdFh99rlA&oe=63560915
