Datasets

9,496 machine learning datasets
Filter by Task
Speech Recognition 14 Audio Classification 7 Speech Enhancement 7 Speech Emotion Recognition 5 Speech Separation 5 Spoken Language Understanding 5 Automatic Speech Recognition 4 Automatic Speech Recognition (ASR) 4 Emotion Recognition 4 Gesture Generation 4 Audio to Text Retrieval 3 Classification 3 Distant Speech Recognition 3 Emotion Classification 3 Language Modelling 3 Slot Filling 3 Speech Denoising 3 Speech Extraction 3 Speech Synthesis 3 Text to Audio Retrieval 3 Text-To-Speech Synthesis 3 3D Face Animation 2 3D Object Classification 2 Arousal Estimation 2 Audio Emotion Recognition 2 Audio Source Separation 2 Audio-Visual Speech Recognition 2 Audio-Visual Synchronization 2 Automatic Phoneme Recognition 2 Depression Detection 2 Emotion Recognition in Conversation 2 Facial Emotion Recognition 2 Facial Expression Recognition (FER) 2 Few-Shot Audio Classification 2 Intent Detection 2 Intent Discovery 2 Keyword Spotting 2 Language Identification 2 Multimodal Sentiment Analysis 2 Music Information Retrieval 2 Object Recognition 2 Open Intent Discovery 2 Out of Distribution (OOD) Detection 2 Question Answering 2 Resynthesis 2 Robust Speech Recognition 2 Speech Dereverberation 2 Valence Estimation 2 Video Emotion Recognition 2 Visual Question Answering (VQA) 2 2D Object Detection 1 3D Face Reconstruction 1 3D Facial Expression Recognition 1 3D Human Reconstruction 1 3D Object Detection 1 3D Object Recognition 1 Abstractive Text Summarization 1 Accented Speech Recognition 1 Action Parsing 1 Action Recognition 1 Active Speaker Localization 1 Activity Detection 1 Activity Prediction 1 Activity Recognition 1 Anomaly Detection In Surveillance Videos 1 Anxiety Detection 1 Audio Fingerprint 1 Audio Signal Processing 1 Audio Super-Resolution 1 Audio Synthesis 1 Audio Tagging 1 Audio captioning 1 Audio-visual Question Answering 1 Automatic Lyrics Transcription 1 Bandwidth Extension 1 Chord Recognition 1 Cross-Lingual ASR 1 Cross-Lingual POS Tagging 1 Cross-Lingual Transfer 1 
Cross-Modal Retrieval 1 Cross-lingual zero-shot dependency parsing 1 Data Augmentation 1 DeepFake Detection 1 Dependency Parsing 1 Directional Hearing 1 Dominance Estimation 1 ECG Classification 1 Face Clustering 1 Headline Generation 1 Human Pose Forecasting 1 Image Captioning 1 Image Classification 1 Image Generation from Scene Graphs 1 Image Retrieval 1 Image-to-Text Retrieval 1 Intent Classification 1 LABELED_DEPENDENCIES 1 LEMMA 1 MORPH 1 Meeting Summarization 1 Mobile Security 1 Motion Synthesis 1 Multi-Label Classification 1 Multi-Label Learning 1 Multi-Task Learning 1 Multi-modal Classification 1 Multi-task Audio Source Separation 1 Multimodal Abstractive Text Summarization 1 Multimodal Activity Recognition 1 Multimodal Deep Learning 1 Multimodal Reasoning 1 Multiview Detection 1 Music Captioning 1 Music Emotion Recognition 1 Music Recommendation 1 Music Tagging 1 Music Transcription 1 Named Entity Recognition (NER) 1 Object Categorization 1 Occluded Face Detection 1 POS 1 Part-Of-Speech Tagging 1 Pose Estimation 1 Question Generation 1 Real-time Directional Hearing 1 Robot Manipulation 1 SENTS 1 SQL Parsing 1 Sarcasm Detection 1 Scene Graph Detection 1 Scene Understanding 1 Scene-Aware Dialogue 1 Semantic Parsing 1 Sequential Image Classification 1 Sound Event Detection 1 Sound Event Localization and Detection 1 Speaker Diarization 1 Speaker Identification 1 Speaker Separation 1 Speech Intent Classification 1 Speech-to-Gesture Translation 1 Speech-to-Speech Translation 1 Speech-to-Text Translation 1 Spoken language identification 1 Supervised Video Summarization 1 Synthetic Speech Detection 1 TAG 1 Temporal Forgery Localization 1 Text Generation 1 Text Segmentation 1 Text Summarization 1 Text-to-Music Generation 1 Time Series Averaging 1 Time Series Classification 1 Time Series Clustering 1 Token Classification 1 Translation 1 UNLABELED_DEPENDENCIES 1 Unsupervised Video Summarization 1 Urdu Speech Recognition 1 Video Classification 1
Video Emotion Detection 1 Video Retrieval 1 Video Summarization 1 Video Synchronization 1 Video Understanding 1 Video-Text Retrieval 1 Visual Speech Recognition 1 Voice Anti-spoofing 1 Voice Conversion 1 Voice Query Recognition 1 Wikipedia Summarization 1 Word Translation 1 Zero-Shot Learning 1 Zero-shot Audio Captioning 1 Zero-shot Text to Audio Retrieval 1 audio-visual event localization 1 audio-visual learning 1 speech-recognition 1
Filter by Language
English German 19 Chinese 18 French 15 Spanish 15 Russian 11 Italian 10 Japanese 10 Portuguese 9 Arabic 7 Hindi 7 Dutch 6 Persian 6 Tamil 6 Turkish 5 Catalan 4 Estonian 4 Indonesian 4 Latvian 4 Slovenian 4 Swedish 4 Welsh 4 Czech 3 Greek 3 Korean 3 Mandarin Chinese 3 Mongolian 3 Polish 3 Romanian 3 Ukrainian 3 Vietnamese 3 Assamese 2 Basque 2 Bengali 2 Breton 2 Bulgarian 2 Finnish 2 Fon 2 Hungarian 2 Irish 2 Kazakh 2 Lithuanian 2 Malayalam 2 Maltese 2 Marathi 2 Multilingual 2 Odia 2 Punjabi 2 Slovak 2 Telugu 2 Thai 2 Afrikaans 1 Akkadian 1 Akuntsu 1 Albanian 1 Amharic 1 Ancient Greek 1 Apurinã 1 Armenian 1 Assyrian Neo-Aramaic 1 Bambara 1 Belarusian 1 Bemba (Zambia) 1 Bhojpuri 1 Bodo (India) 1 Chukot 1 Church Slavic 1 Chuvash 1 Coptic 1 Croatian 1 Danish 1 Dhivehi 1 Erzya 1 Esperanto 1 Faroese 1 Galician 1 Georgian 1 Gothic 1 Gujarati 1 Hakha Chin 1 Hebrew 1 Icelandic 1 Kabyle 1 Kannada 1 Karelian 1 Khunsari 1 Kinyarwanda 1 Komi-Permyak 1 Komi-Zyrian 1 Latin 1 Literary Chinese 1 Livvi 1 Lozi 1 Lunda 1 Manipuri 1 Manx 1 Mbyá Guaraní 1 Modern Greek 1 Moksha 1 Mundurukú 1 Nayini 1 Nigerian Pidgin 1 Northern Kurdish 1 Northern Sami 1 Norwegian 1 Nyanja 1 Old French 1 Old Russian 1 Old Turkish 1 Quechua 1 Rajasthani 1 Russia Buriat 1 Sanskrit 1 Scottish Gaelic 1 Serbian 1 Skolt Sami 1 Soi 1 South Levantine Arabic 1 Swedish Sign Language 1 Swiss German 1 Tagalog 1 Tatar 1 Tonga (Zambia) 1 Tupinambá 1 Uighur 1 Upper Sorbian 1 Urdu 1 Uzbek 1 Votic 1 Warlpiri 1 Wolof 1 Yoruba 1 Yue Chinese 1

125 dataset results for Audio AND English