Datasets

11,053 machine learning datasets

944 dataset results for Videos