We then apply GCNs over the graph to model the relations among proposals and learn powerful representations for action classification and localization.
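As a rough illustration of graph convolution over action proposals, the sketch below builds a proposal graph and applies one GCN layer with NumPy. The excerpt does not say how edges are defined, so connecting proposals whose temporal IoU exceeds a threshold is an assumption made here, and the names `temporal_iou`, `build_adjacency`, and `gcn_layer` are hypothetical helpers, not the paper's implementation.

```python
import numpy as np

def temporal_iou(a, b):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def build_adjacency(proposals, threshold=0.5):
    """Assumed edge rule: link two proposals when their tIoU >= threshold."""
    n = len(proposals)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            if temporal_iou(proposals[i], proposals[j]) >= threshold:
                adj[i, j] = adj[j, i] = 1.0
    return adj

def gcn_layer(features, adjacency, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))   # degree normalization
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)

# Toy example: three proposals, 4-dim features projected to 2 dims.
proposals = [(0.0, 10.0), (2.0, 11.0), (50.0, 60.0)]
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4))
weight = rng.normal(size=(4, 2))

adjacency = build_adjacency(proposals)
refined = gcn_layer(features, adjacency, weight)
```

In this toy setup the first two proposals overlap enough to be linked, so their refined features mix, while the third proposal only sees its own features through the self-loop.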
Our joint formulation has three terms: a classification term to ensure the separability of learned action features, an adapted multi-label center loss term to enhance action feature discriminability, and a counting loss term to delineate adjacent action sequences, leading to improved localization.
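One plausible way to write such a three-term objective is as a weighted sum. The symbols below, including the balancing weights \alpha and \beta, are assumptions for illustration; the excerpt does not state how the terms are combined or weighted.

```latex
\mathcal{L} = \mathcal{L}_{\mathrm{cls}}
            + \alpha \, \mathcal{L}_{\mathrm{center}}
            + \beta \, \mathcal{L}_{\mathrm{count}}
```

Here \mathcal{L}_{\mathrm{cls}} stands for the classification term, \mathcal{L}_{\mathrm{center}} for the adapted multi-label center loss, and \mathcal{L}_{\mathrm{count}} for the counting loss.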
Temporal action localization has recently attracted significant interest in the Computer Vision community.
In this task, every WiFi distortion sample in the whole series should be categorized into one action, which is a critical technique in precise action localization, continuous action segmentation, and real-time action recognition.
Our approach only needs to modify the input image and can work with any network to improve its performance.
In this paper, we first develop a novel weakly-supervised TAL framework called AutoLoc to directly predict the temporal boundary of each action instance.
Second, we propose an actor-based attention mechanism that enables the localization of the actions from action class labels and actor proposals and is end-to-end trainable.
This paper presents a new large-scale dataset for recognition and temporal localization of human actions collected from Web videos.
We propose a weakly supervised temporal action localization algorithm on untrimmed videos using convolutional neural networks.
#2 best model for Weakly Supervised Action Localization on THUMOS 2014
The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.