Datasets

10,725 machine learning datasets

441 dataset results for Audio