The People's Speech is a free-to-download 30,000-hour and growing supervised conversational English speech recognition dataset licensed for academic and commercial usage under CC-BY-SA (with a CC-BY subset). The data is collected by searching the Internet for appropriately licensed audio data with existing transcriptions.
5 PAPERS • NO BENCHMARKS YET
…This corpus has several outstanding characteristics: hundreds of hours of aligned audio from a diverse set of readers about a diverse set of topics in a well-researched textual genre; licensed under a free license (CC BY-SA 4.0); annotations can be mapped back to the original HTML; phoneme-level alignments
1 PAPER • 1 BENCHMARK
…localized narratives (synchronized voice, mouse trace, and text caption); 66,391,027 point-level annotations on 5,827 classes; 61,404,966 image-level labels on 20,638 classes. Images are under a CC BY 2.0 license, annotations under a CC BY 4.0 license.
4 PAPERS • NO BENCHMARKS YET
…Corpus includes: SLUE-VoxPopuli, consisting of ASR and NER tasks (CC0 license); SLUE-VoxCeleb, consisting of ASR and SA tasks (CC BY 4.0 license)
20 PAPERS • 3 BENCHMARKS
…The podcast audio recordings, sourced from SoundCloud, are CC-licensed, gender-balanced, and total 145 hours of audio from over 350 speakers. The annotations are provided under a non-commercial license and consist of 85,803 manually annotated audio events including approximately 35,000 filler words (“uh” and “um”) and 50,000 non-filler events
3 PAPERS • 1 BENCHMARK
…Please contact us (contact@deeplyinc.com) for the full set with the research/commercial license.
0 PAPERS • NO BENCHMARKS YET
…Please see the individual readme and license files in each of the dataset folders within the NISQA_Corpus.zip for more details about the datasets and the licenses.
1 PAPER • NO BENCHMARKS YET
…The corpus is available for free under the Creative Commons license (CC BY-NC-SA 4.0). The dataset is available on Hugging Face and GitHub. Data Fields: file_id - filename, i.e. … Additional Information, Licensing Information: The dataset is available under the Creative Commons license (CC BY-NC-SA 4.0). Citation Information: You can access the nEMO paper at arXiv.
…The Google Speech Commands v2 dataset is under the Creative Commons BY 4.0 license.
2 PAPERS • 1 BENCHMARK
…Ltd under Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License.