Cascaded Boundary Regression for Temporal Action Detection

2 May 2017 · Jiyang Gao, Zhenheng Yang, Ram Nevatia

Temporal action detection in long videos is an important problem. State-of-the-art methods address this problem by applying action classifiers to sliding windows. Although a sliding window may contain an identifiable portion of an action, it does not necessarily cover the entire action instance, which leads to inferior performance. We adapt a two-stage temporal action detection pipeline with a Cascaded Boundary Regression (CBR) model. The first stage detects class-agnostic proposals, and the second stage detects specific actions. CBR uses temporal coordinate regression to refine the temporal boundaries of the sliding windows. The salient aspect of the refinement process is that, inside each stage, the temporal boundaries are adjusted in a cascaded way by feeding the refined windows back to the system for further boundary refinement. We test CBR on THUMOS-14 and TVSeries, and achieve state-of-the-art performance on both datasets. The performance gain is especially remarkable under high IoU thresholds, e.g. mAP@tIoU=0.5 on THUMOS-14 is improved from 19.0% to 31.0%.
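The cascaded refinement described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function `cascaded_refine`, the `num_steps` parameter, and the `toy_regressor` (which simply moves each boundary halfway toward a fixed target) are hypothetical names introduced here. In the paper, the regressor is a learned model operating on video features.

```python
from typing import Callable, Tuple

Window = Tuple[float, float]  # (start, end) of a temporal window, e.g. in seconds


def cascaded_refine(
    window: Window,
    regressor: Callable[[Window], Tuple[float, float]],
    num_steps: int = 3,
) -> Window:
    """Refine a window by repeatedly feeding the refined window back in.

    `regressor` is assumed to return (start_offset, end_offset) for the
    window it is given; each step applies those offsets and reuses the
    result as the input of the next step.
    """
    start, end = window
    for _ in range(num_steps):
        d_start, d_end = regressor((start, end))
        new_start, new_end = start + d_start, end + d_end
        if new_end <= new_start:
            # Stop if the regressor would collapse the window.
            break
        start, end = new_start, new_end
    return (start, end)


def toy_regressor(window: Window, target: Window = (10.0, 20.0)) -> Tuple[float, float]:
    """Hypothetical stand-in for a learned regressor.

    Moves each boundary halfway toward a fixed target action instance.
    """
    return (0.5 * (target[0] - window[0]), 0.5 * (target[1] - window[1]))


if __name__ == "__main__":
    initial = (8.0, 16.0)
    for steps in (1, 2, 3):
        print(steps, cascaded_refine(initial, toy_regressor, num_steps=steps))
```

With this toy regressor, each additional cascade step moves the window closer to the target, which is the intuition behind feeding refined windows back for further refinement.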


Results from the Paper


Task: Temporal Action Localization
Dataset: THUMOS’14
Model: CBR-TS

Metric Name    Metric Value    Global Rank
mAP IOU@0.5    31              # 31
mAP IOU@0.1    60.1            # 6
mAP IOU@0.2    56.7            # 6
mAP IOU@0.3    50.1            # 29
mAP IOU@0.4    41.3            # 29
mAP IOU@0.6    19.1            # 26
mAP IOU@0.7    9.9             # 26
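
The IoU thresholds in the metric names refer to temporal intersection-over-union between a predicted segment and a ground-truth segment. A minimal Python sketch of that overlap measure follows. The function name `temporal_iou` and the (start, end) tuple representation are assumptions made for illustration, and the official THUMOS’14 evaluation code may differ in its details.

```python
from typing import Tuple

Segment = Tuple[float, float]  # (start, end), with end >= start


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal IoU of two segments: overlap length divided by union length."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    prediction = (12.0, 20.0)
    ground_truth = (10.0, 20.0)
    iou = temporal_iou(prediction, ground_truth)
    # A prediction is typically counted as correct at threshold t when iou >= t.
    print(iou, iou >= 0.5)
```

Higher thresholds demand tighter agreement between predicted and true boundaries, so they reward accurate boundary localization more strongly.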
