Datasets

7,690 machine learning datasets

458 dataset results for Audio