Datasets

9,740 machine learning datasets

251 dataset results for Videos AND English