Large-scale weakly-supervised pre-training for video action recognition

CVPR 2019 · Deepti Ghadiyaram, Matt Feiszli, Du Tran, Xueting Yan, Heng Wang, Dhruv Mahajan

Current fully-supervised video datasets consist of only a few hundred thousand videos and fewer than a thousand domain-specific labels. This hinders the progress towards advanced video architectures...

Task                             Dataset        Model                                 Metric              Value  Global rank
Egocentric Activity Recognition  EPIC-Kitchens  R(2+1)D-152-SE (ig)                   Actions Top-1 (S2)  25.6   #1
Egocentric Activity Recognition  EPIC-Kitchens  R(2+1)D-34 (kinetics)                 Actions Top-1 (S2)  16.8   #4
Action Classification            Kinetics-400   irCSN-152 (IG-Kinetics-65M pretrain)  Accuracy            82.8   #2
Action Recognition In Videos     Kinetics-400   R(2+1)D-152*                          Video hit@1         81.3   #1
Action Recognition In Videos     Kinetics-400   R(2+1)D-152*                          Video hit@5         95.1   #1