Datasets

11,595 machine learning datasets
Filter by Task
Audio Classification 33 Speech Recognition 33 Music Information Retrieval 24 Automatic Speech Recognition (ASR) 21 Music Generation 21 Sound Event Detection 20 Beat Tracking 14 Speech Emotion Recognition 13 Downbeat Tracking 12 Speech Enhancement 12 Information Retrieval 11 Music Transcription 11 Audio Tagging 10 Language Modelling 10 Acoustic Scene Classification 9 Audio Source Separation 9 Emotion Recognition 9 Few-Shot Audio Classification 9 Music Source Separation 9 Speaker Verification 9 Speech Separation 9 Automatic Speech Recognition 8 Data Augmentation 8 Multi-Task Learning 8 Scene Classification 8 Sound Event Localization and Detection 8 Speech Synthesis 8 Text-To-Speech Synthesis 8 Audio Generation 7 Classification 7 Video Understanding 7 Audio-Visual Speech Recognition 6 Automatic Phoneme Recognition 6 Bandwidth Extension 6 Emotion Classification 6 Environmental Sound Classification 6 Multimodal Emotion Recognition 6 Action Recognition 5 Emotion Recognition in Conversation 5 Lipreading 5 Multi-instrument Music Transcription 5 Music Classification 5 Spoken Language Understanding 5 Text to Audio Retrieval 5 Anomaly Detection 4 Audio captioning 4 Audio to Text Retrieval 4 Audio-Visual Synchronization 4 Gesture Generation 4 Image Classification 4 Keyword Spotting 4 Lip Reading 4 Multi-Label Classification 4 Multimodal Deep Learning 4 Multimodal Sentiment Analysis 4 Music Modeling 4 Music Recommendation 4 Music Tagging 4 Question Answering 4 Recommendation Systems 4 Text-to-Music Generation 4 Video Retrieval 4 Visual Speech Recognition 4 Voice Conversion 4 Zero-shot Audio Classification 4 Action Quality Assessment 3 Audio Emotion Recognition 3 Bird Audio Detection 3 Cross-Modal Retrieval 3 DeepFake Detection 3 Direction of Arrival Estimation 3 Distant Speech Recognition 3 Drum Transcription 3 Facial Expression Recognition (FER) 3 Genre classification 3 Instrument Recognition 3 Language Identification 3 Multi-modal Classification 3 Multi-task Audio Source 
Seperation 3 Music Auto-Tagging 3 Music Emotion Recognition 3 Object Recognition 3 Online Beat Tracking 3 Online Downbeat Tracking 3 Quantization 3 Robust Speech Recognition 3 Self-Supervised Learning 3 Semantic Segmentation 3 Speaker Diarization 3 Speaker Recognition 3 Speech Denoising 3 Speech Extraction 3 Spoken language identification 3 Style Transfer 3 Synthetic Speech Detection 3 Talking Face Generation 3 Text Summarization 3 Unconstrained Lip-synchronization 3 Video Emotion Recognition 3 Video Summarization 3 audio-visual learning 3 3D Face Animation 2 3D Object Classification 2 Abstractive Text Summarization 2 Acoustic echo cancellation 2 Activity Recognition 2 Arousal Estimation 2 Audio Deepfake Detection 2 Audio Signal Processing 2 Audio Super-Resolution 2 Chord Recognition 2 Contrastive Learning 2 Depression Detection 2 Domain Adaptation 2 Drum Transcription in Music (DTM) 2 Environment Sound Classification 2 Facial Emotion Recognition 2 Few-Shot Learning 2 Intent Detection 2 Landmark-based Lipreading 2 Lip to Speech Synthesis 2 Multiview Learning 2 Music Captioning 2 Music Genre Classification 2 Music Genre Recognition 2 Open Set Learning 2 Resynthesis 2 Scene Understanding 2 Skills Assessment 2 Skills Evaluation 2 Slot Filling 2 Sound Classification 2 Speaker Identification 2 Speaker Separation 2 Speech Dereverberation 2 Speech-to-Text Translation 2 Talking Head Generation 2 Target Sound Extraction 2 Unsupervised Anomaly Detection 2 Valence Estimation 2 Video Captioning 2 Video Classification 2 Video Emotion Detection 2 Video Question Answering 2 Visual Keyword Spotting 2 Visual Question Answering (VQA) 2 Voice Cloning 2 Voice Query Recognition 2 Zero-Shot Audio Retrieval 2 Zero-Shot Video Question Answer 2 Zero-shot Audio Captioning 2 Zero-shot Text to Audio Retrieval 2 automatic-speech-translation 2 1 2D Object Detection 1 2D Panoptic Segmentation 1 3D Face Reconstruction 1 3D Facial Expression Recognition 1 3D Human Reconstruction 1 3D Object 
Detection 1 3D Object Recognition 1 3D Panoptic Segmentation 1 3D Point Cloud Reconstruction 1 4D Panoptic Segmentation 1 Accented Speech Recognition 1 Acoustic Modelling 1 Action Anticipation 1 Action Parsing 1 Action Recognition In Videos 1 Action Understanding 1 Active Learning 1 Active Speaker Localization 1 Activity Detection 1 Activity Prediction 1 Adversarial Robustness 1 Anomaly Detection In Surveillance Videos 1 Anxiety Detection 1 Audio Denoising 1 Audio Effects Modeling 1 Audio Fingerprint 1 Audio Multiple Target Classification 1 Audio Quality Assessment 1 Audio Synthesis 1 Audio-visual Question Answering 1 Audio/Video to Text Retrieval 1 Automatic Lyrics Transcription 1 Automatic Sleep Stage Classification 1 Cadenza 1 - Task 1 - Headphone 1 Cadenza 1 - Task 2 - In Car 1 Caller Detection 1 Clustering 1 Common Sense Reasoning 1 Conversational Response Generation 1 Cross-Lingual ASR 1 Cross-Lingual POS Tagging 1 Cross-Lingual Transfer 1 Cross-lingual zero-shot dependency parsing 1 Decision Making Under Uncertainty 1 Dense Video Captioning 1 Dependency Parsing 1 Dialog Act Classification 1 Dialogue Act Classification 1 Dialogue Evaluation 1 Dialogue Generation 1 Dimensionality Reduction 1 Directional Hearing 1 Domain Generalization 1 Dominance Estimation 1 ECG Classification 1 ENF (Electric Network Frequency) Detection 1 ENF (Electric Network Frequency) Extraction 1 ENF (Electric Network Frequency) Extraction from Video 1 Emotional Dialogue Acts 1 Event Detection 1 Face Clustering 1 Face Detection 1 Facial Expression Recognition 1 Fact Checking 1 Fake Song Detection 1 Federated Learning 1 Fill Mask 1 Fine-Grained Visual Categorization 1 Fine-Grained Visual Recognition 1 Fine-grained Action Recognition 1 Gait Recognition 1 Gender Bias Detection 1 Gender Classification 1 Gender Prediction 1 Gunshot Detection 1 Hate Speech Detection 1 Headline Generation 1 Human Interaction Recognition 1 Human Pose Forecasting 1 Humor Detection 1 Image Captioning 1 Image 
Generation 1 Image Generation from Scene Graphs 1 Image Manipulation 1 Image Retrieval 1 Indoor Localization 1 Intent Classification 1 Intent Discovery 1 Knowledge Graphs 1 Learning with noisy labels 1 Linear evaluation 1 Link Prediction 1 Matrix Completion 1 Meeting Summarization 1 Melody Extraction 1 Metric Learning 1 Mobile Security 1 Motion Synthesis 1 Multi-Label Learning 1 Multi-Source Unsupervised Domain Adaptation 1 Multimodal Abstractive Text Summarization 1 Multimodal Activity Recognition 1 Multimodal Reasoning 1 Multimodal Sleep Stage Detection 1 Multiple-choice 1 Multiview Detection 1 Music Compression 1 Music Genre Transfer 1 Music Performance Rendering 1 Music Quality Assessment 1 Music Question Answering 1 Music Style Transfer 1 Named Entity Recognition (NER) 1 Natural Language Inference (Few-Shot) 1 Neural Architecture Search 1 Object Categorization 1 Occluded Face Detection 1 Open Intent Discovery 1 Open-Domain Dialog 1 Open-Ended Question Answering 1 Opinion Mining 1 Optical Flow Estimation 1 Out of Distribution (OOD) Detection 1 Out-of-Distribution Detection 1 Part-Of-Speech Tagging 1 Personality Recognition in Conversation 1 Personality Trait Recognition 1 Personalized and Emotional Conversation 1 Physical Attribute Prediction 1 Physical Commonsense Reasoning 1 Piano Music Modeling 1 Pitch Classification 1 Pose Estimation 1 Prediction Intervals 1 Question Generation 1 Real-time Directional Hearing 1 Retrieval 1 Retrieval-augmented Few-shot In-context Audio Captioning 1 Robot Manipulation 1 SQL Parsing 1 Saliency Detection 1 Saliency Prediction 1 Sarcasm Detection 1 Scene Graph Detection 1 Scene Graph Generation 1 Scene Recognition 1 Scene-Aware Dialogue 1 Seizure Detection 1 Self-Driving Cars 1 Self-Supervised Audio Classification 1 Semantic Parsing 1 Sentence Embedding 1 Sentiment Analysis 1 Sequential Image Classification 1 Sequential skip prediction 1 Shooter Localization 1 Singer Identification 1 Singing Voice Synthesis 1 Sleep Stage 
Detection 1 Speech Intent Classification 1 Speech Synthesis - Assamese 1 Speech Synthesis - Bengali 1 Speech Synthesis - Bodo 1 Speech Synthesis - Gujarati 1 Speech Synthesis - Hindi 1 Speech Synthesis - Kannada 1 Speech Synthesis - Malayalam 1 Speech Synthesis - Manipuri 1 Speech Synthesis - Marathi 1 Speech Synthesis - Rajasthani 1 Speech Synthesis - Tamil 1 Speech Synthesis - Telugu 1 Speech-to-Gesture Translation 1 Speech-to-Phoneme 1 Speech-to-Speech Translation 1 Spoken Dialogue Systems 1 Supervised Video Summarization 1 Surgical phase recognition 1 Synthetic Song Detection 1 Task-Oriented Dialogue Systems 1 Temporal Forgery Localization 1 Text Classification 1 Text Generation 1 Text Segmentation 1 Text to Audio/Video Retrieval 1 Time Offset Calibration 1 Time Series Alignment 1 Time Series Analysis 1 Time Series Averaging 1 Time Series Classification 1 Time Series Clustering 1 Transfer Learning 1 Translation 1 Uncertainty Quantification 1 Unsupervised Video Summarization 1 Video Domain Adapation 1 Video Generation 1 Video Object Segmentation 1 Video Panoptic Segmentation 1 Video Reconstruction 1 Video Saliency Detection 1 Video Saliency Prediction 1 Video Segmentation 1 Video Synchronization 1 Video-Text Retrieval 1 Video-to-Sound Generation 1 Vocal technique classification 1 Voice Anti-spoofing 1 Voice pathology detection 1 Word Embeddings 1 Word Translation 1 Zero-Shot Environment Sound Classification 1 Zero-Shot Learning 1 Zero-Shot Video Retrieval 1 audio-visual event localization 1 de-en 1 es-en 1 fr-en 1 video narration captioning 1 zero-shot long video question answering 1
Filter by Language
English 159 French 24 Chinese 23 German 23 Spanish 21 Japanese 12 Russian 12 Italian 11 Portuguese 9 Arabic 7 Hindi 7 Dutch 6 Persian 6 Tamil 6 Turkish 5 Vietnamese 5 Catalan 4 Estonian 4 Indonesian 4 Korean 4 Latvian 4 Polish 4 Slovenian 4 Swedish 4 Ukrainian 4 Welsh 4 Czech 3 Greek 3 Mandarin Chinese 3 Mongolian 3 Romanian 3 Assamese 2 Basque 2 Bengali 2 Breton 2 Bulgarian 2 Finnish 2 Fon 2 Hungarian 2 Irish 2 Kazakh 2 Lithuanian 2 Malayalam 2 Maltese 2 Marathi 2 Multilingual 2 Odia 2 Punjabi 2 Slovak 2 Telugu 2 Thai 2 Afrikaans 1 Akkadian 1 Akuntsu 1 Albanian 1 Amharic 1 Ancient Greek 1 Apurinã 1 Armenian 1 Assyrian Neo-Aramaic 1 Bambara 1 Belarusian 1 Bemba (Zambia) 1 Bhojpuri 1 Bodo (India) 1 Chukot 1 Church Slavic 1 Chuvash 1 Coptic 1 Croatian 1 Danish 1 Dhivehi 1 Erzya 1 Esperanto 1 Faroese 1 Galician 1 Georgian 1 Gothic 1 Gujarati 1 Hakha Chin 1 Hebrew 1 Icelandic 1 Kabyle 1 Kannada 1 Karelian 1 Khunsari 1 Kinyarwanda 1 Komi-Permyak 1 Komi-Zyrian 1 Latin 1 Literary Chinese 1 Livvi 1 Lozi 1 Lunda 1 Manipuri 1 Manx 1 Mbyá Guaraní 1 Modern Greek 1 Moksha 1 Mundurukú 1 Nayini 1 Nigerian Pidgin 1 Northern Kurdish 1 Northern Sami 1 Norwegian 1 Nyanja 1 Old French 1 Old Russian 1 Old Turkish 1 Oromo 1 Quechua 1 Rajasthani 1 Russia Buriat 1 Sanskrit 1 Scottish Gaelic 1 Serbian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Swedish Sign Language 1 Swiss German 1 Tagalog 1 Tatar 1 Tonga (Zambia) 1 Tupinambá 1 Uighur 1 Upper Sorbian 1 Urdu 1 Uzbek 1 Votic 1 Warlpiri 1 Wolof 1 Yoruba 1 Yue Chinese 1

464 dataset results for Audio