12 dataset results for license plates AND Speech

The People's Speech is a free-to-download, 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset). The data is collected by searching the Internet for appropriately licensed audio data with existing transcriptions.

5 PAPERS • NO BENCHMARKS YET

The Spoken Wikipedia Corpora

…This corpus has several outstanding characteristics: hundreds of hours of aligned audio from a diverse set of readers about a diverse set of topics in a well-researched textual genre; licensed under a free license (CC BY-SA 4.0); annotations can be mapped back to the original HTML; phoneme-level alignments

1 PAPER • 1 BENCHMARK

Open Images V7

…localized narratives (synchronized voice, mouse trace, and text caption); 66,391,027 point-level annotations on 5,827 classes; 61,404,966 image-level labels on 20,638 classes. Images are under a CC BY 2.0 license, annotations under a CC BY 4.0 license.

4 PAPERS • NO BENCHMARKS YET

SLUE

SLUE (Spoken Language Understanding Evaluation)

…Corpus includes: SLUE-VoxPopuli: ASR and NER tasks, CC0 license; SLUE-VoxCeleb: ASR and SA tasks, CC BY 4.0 license

20 PAPERS • 3 BENCHMARKS

PodcastFillers

…The podcast audio recordings, sourced from SoundCloud, are CC-licensed, gender-balanced, and total 145 hours of audio from over 350 speakers. The annotations are provided under a non-commercial license and consist of 85,803 manually annotated audio events including approximately 35,000 filler words (“uh” and “um”) and 50,000 non-filler events

3 PAPERS • 1 BENCHMARK

Deeply vocal characterizer

…Please contact us (contact@deeplyinc.com) for the full set with the research/commercial license.

0 PAPER • NO BENCHMARKS YET

NISQA Speech Quality Corpus

…Please see the individual readme and license files in each of the dataset folders within the NISQA_Corpus.zip for more details about the datasets and the licenses.

1 PAPER • NO BENCHMARKS YET

nEMO

…The corpus is available for free under the Creative Commons license (CC BY-NC-SA 4.0). The dataset is available on Hugging Face and GitHub. Data fields: file_id - filename, i.e. … The nEMO paper is available on arXiv.

1 PAPER • NO BENCHMARKS YET

Deeply Korean read speech

…Please contact us (contact@deeplyinc.com) for the full set with the research/commercial license.

0 PAPER • NO BENCHMARKS YET

Deeply Parent-Child vocal interaction

…Please contact us (contact@deeplyinc.com) for the full set with the research/commercial license.

0 PAPER • NO BENCHMARKS YET

Google Speech Commands - Musan

…The Google Speech Commands v2 dataset is under the Creative Commons BY 4.0 license.

2 PAPERS • 1 BENCHMARK

aidatatang_200zh

…Ltd under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License.

0 PAPER • NO BENCHMARKS YET
