Datasets

9,299 machine learning datasets

379 dataset results for Audio