SlowFast Networks for Video Recognition

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution... The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast
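The two-rate idea in the abstract can be illustrated with a minimal sketch of how one clip might be sampled for each pathway. This is not the authors' implementation. The function names, the frame-rate ratio `alpha=8`, and the channel ratio `beta=1/8` are assumptions chosen for illustration and are not stated on this page. The sketch also assumes that a model label such as "4x16" in the tables below means 4 Slow-pathway frames taken at a temporal stride of 16.

```python
def sample_pathway_indices(num_frames, slow_frames=4, slow_stride=16, alpha=8):
    """Return (slow_idx, fast_idx): frame indices fed to each pathway.

    Hypothetical helper. alpha is the assumed ratio between the Fast and
    Slow frame rates, so the Fast pathway uses a stride alpha times smaller.
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    clip_len = slow_frames * slow_stride  # raw frames spanned by one clip
    if clip_len > num_frames:
        raise ValueError("video is shorter than one clip")
    fast_stride = slow_stride // alpha
    slow_idx = list(range(0, clip_len, slow_stride))  # sparse in time
    fast_idx = list(range(0, clip_len, fast_stride))  # dense in time
    return slow_idx, fast_idx


def fast_channels(slow_channels, beta=0.125):
    """Channel width of the Fast pathway under an assumed ratio beta."""
    return max(1, int(slow_channels * beta))


slow_idx, fast_idx = sample_pathway_indices(num_frames=64)
print(len(slow_idx), len(fast_idx))  # number of frames per pathway
print(fast_channels(64))             # Fast-pathway width for a 64-channel Slow layer
```

Under these assumptions the Fast pathway sees alpha times more frames than the Slow pathway but carries only a fraction beta of its channels, which is the sense in which the abstract calls it lightweight.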

Published at ICCV 2019.

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Action Recognition | AVA v2.1 | SlowFast (Kinetics-400 pretraining) | mAP (Val) | 26.3 | #7 |
| Action Recognition | AVA v2.1 | SlowFast++ (Kinetics-600 pretraining, NL) | mAP (Val) | 28.3 | #2 |
| Action Recognition | AVA v2.1 | SlowFast (Kinetics-600 pretraining, NL) | mAP (Val) | 27.3 | #5 |
| Action Recognition | AVA v2.1 | SlowFast (Kinetics-600 pretraining) | mAP (Val) | 26.8 | #6 |
| Action Recognition | AVA v2.2 | SlowFast, 8x8 R101+NL (Kinetics-600 pretraining) | mAP | 27.1 | #7 |
| Action Recognition | AVA v2.2 | SlowFast, 4x16, R50 (Kinetics-400 pretraining) | mAP | 21.9 | #13 |
| Action Classification | Charades | SlowFast (Kinetics-600 pretraining) | MAP | 42.1 | #21 |
| Action Classification | Charades | SlowFast (Kinetics-600 pretraining, NL) | MAP | 45.2 | #14 |
| Action Classification | Charades | SlowFast (Kinetics-400 pretraining, NL) | MAP | 42.5 | #19 |
| Action Classification | Kinetics-400 | SlowFast 16x8 (ResNet-101 + NL) | Vid acc@5 | 93.9 | #35 |
| Action Classification | Kinetics-400 | SlowFast 4x16 (ResNet-50) | Vid acc@1 | 75.6 | #79 |
| Action Classification | Kinetics-400 | SlowFast 4x16 (ResNet-50) | Vid acc@5 | 92.1 | #56 |
| Action Classification | Kinetics-400 | SlowFast 16x8 (ResNet-101 + NL) | Vid acc@1 | 79.8 | #35 |
| Action Classification | Kinetics-400 | SlowFast 16x8 (ResNet-101) | Vid acc@1 | 78.9 | #48 |
| Action Classification | Kinetics-400 | SlowFast 16x8 (ResNet-101) | Vid acc@5 | 93.5 | #43 |
| Action Classification | Kinetics-400 | SlowFast 8x8 (ResNet-101) | Vid acc@1 | 77.9 | #57 |
| Action Classification | Kinetics-400 | SlowFast 8x8 (ResNet-101) | Vid acc@5 | 93.2 | #48 |
| Action Classification | Kinetics-400 | SlowFast 8x8 (ResNet-50) | Vid acc@1 | 77 | #70 |
| Action Classification | Kinetics-400 | SlowFast 8x8 (ResNet-50) | Vid acc@5 | 92.6 | #53 |
| Action Classification | Kinetics-600 | SlowFast 16x8 (ResNet-101) | Top-1 Accuracy | 81.1 | #24 |
| Action Classification | Kinetics-600 | SlowFast 16x8 (ResNet-101) | Top-5 Accuracy | 95.1 | #20 |
| Action Classification | Kinetics-600 | SlowFast 8x8 (ResNet-101) | Top-1 Accuracy | 80.4 | #26 |
| Action Classification | Kinetics-600 | SlowFast 8x8 (ResNet-101) | Top-5 Accuracy | 94.8 | #23 |
| Action Classification | Kinetics-600 | SlowFast 16x8 (ResNet-101 + NL) | Top-1 Accuracy | 81.8 | #21 |
| Action Classification | Kinetics-600 | SlowFast 16x8 (ResNet-101 + NL) | Top-5 Accuracy | 95.1 | #20 |
| Action Classification | Kinetics-600 | SlowFast 8x8 (ResNet-50) | Top-1 Accuracy | 79.9 | #27 |
| Action Classification | Kinetics-600 | SlowFast 8x8 (ResNet-50) | Top-5 Accuracy | 94.5 | #24 |
| Action Classification | Kinetics-600 | SlowFast 4x16 (ResNet-50) | Top-1 Accuracy | 78.8 | #29 |
| Action Classification | Kinetics-600 | SlowFast 4x16 (ResNet-50) | Top-5 Accuracy | 94 | #25 |
| Action Recognition | Something-Something V2 | SlowFast | Top-1 Accuracy | 61.7 | #44 |

Results from Other Papers


| Task | Dataset | Model | Metric | Value | Rank |
|---|---|---|---|---|---|
| Action Recognition | AVA v2.2 | SlowFast, 8x8, R101 (Kinetics-400 pretraining) | mAP | 23.8 | #12 |
| Action Recognition | AVA v2.2 | SlowFast, 16x8 R101+NL (Kinetics-600 pretraining) | mAP | 27.5 | #4 |
| Action Recognition | Diving-48 | SlowFast | Accuracy | 77.6 | #6 |
