Datasets

8,918 machine learning datasets

819 dataset results for Videos