Spatio-Temporal Action Localization
13 papers with code • 1 benchmark • 6 datasets
Latest papers with no code
Submission to ActivityNet Challenge 2019: Task B Spatio-temporal Action Localization
This technical report presents an overview of our system for the spatio-temporal action localization (SAL) task in the ActivityNet Challenge 2019.
Trimmed Action Recognition, Dense-Captioning Events in Videos, and Spatio-temporal Action Localization with Focus on ActivityNet Challenge 2019
This notebook paper presents an overview and comparative analysis of our systems designed for the following three tasks in ActivityNet Challenge 2019: trimmed action recognition, dense-captioning events in videos, and spatio-temporal action localization.
Improving Action Localization by Progressive Cross-stream Cooperation
Specifically, we first generate a larger set of region proposals by combining the latest region proposals from both streams, from which we can readily obtain a larger set of labelled training samples to help learn better action detection models.
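The cross-stream cooperation described above can be illustrated with a minimal sketch: proposals from the RGB and flow streams are pooled and non-maximum suppression keeps a diverse subset. The box format, scores, and threshold are illustrative assumptions, not the paper's exact procedure.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def merge_stream_proposals(rgb_props, flow_props, nms_thresh=0.7):
    """Pool (box, score) proposals from both streams, then greedy NMS."""
    pooled = sorted(rgb_props + flow_props, key=lambda p: -p[1])
    kept = []
    for box, score in pooled:
        # Keep a proposal only if it does not heavily overlap a kept one.
        if all(box_iou(box, k[0]) < nms_thresh for k in kept):
            kept.append((box, score))
    return kept
```

The merged set is larger than either stream's alone, which is what supplies the extra labelled training samples mentioned above.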
Spatio-Temporal Action Localization in a Weakly Supervised Setting
Enabling computational systems with the ability to localize actions in video-based content has manifold applications.
Spatio-Temporal Instance Learning: Action Tubes from Class Supervision
Rather than disconnecting the spatio-temporal learning from the training, we propose Spatio-Temporal Instance Learning, which enables action localization directly from box proposals in video frames.
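Building an action tube from per-frame box proposals is commonly done by greedy linking: at each frame, pick the box maximizing detection score plus overlap with the previous box. This is a standard linking heuristic sketched here for illustration, not necessarily the exact procedure of this paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def link_tube(frames):
    """frames: per-frame lists of (box, score); returns one linked tube."""
    tube = [max(frames[0], key=lambda p: p[1])]
    for props in frames[1:]:
        prev_box = tube[-1][0]
        # Trade off detection score against spatial continuity.
        tube.append(max(props, key=lambda p: p[1] + box_iou(p[0], prev_box)))
    return tube
```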
YH Technologies at ActivityNet Challenge 2018
This notebook paper presents an overview and comparative analysis of our systems designed for the following five tasks in ActivityNet Challenge 2018: temporal action proposals, temporal action localization, dense-captioning events in videos, trimmed action recognition, and spatio-temporal action localization.
Modeling Spatio-Temporal Human Track Structure for Action Localization
In order to localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions on the level of person tracks.
Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection
A human action can be seen as transitions between one's body poses over time, where the transition depicts a temporal relation between two poses.
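The transition idea can be made concrete with a toy sketch: represent a clip as features over pairs of consecutive pose vectors, so each sample encodes a temporal relation between two poses. The particular feature construction (concatenation plus difference) is an assumption for illustration only.

```python
def transition_features(poses):
    """poses: list of equal-length pose vectors -> list of transition vectors."""
    feats = []
    for prev, curr in zip(poses, poses[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Each transition keeps both poses and their change over time.
        feats.append(prev + curr + delta)
    return feats
```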
What If We Do Not Have Multiple Videos of the Same Action? -- Video Action Localization Using Web Images
We reconstruct video action proposals from image action proposals while enforcing consistency across coefficient vectors of multiple frames by consensus regularization.
Learning to track for spatio-temporal action localization
We present experimental results for spatio-temporal localization on the UCF-Sports, J-HMDB, and UCF-101 action localization datasets, where our approach outperforms the state of the art by margins of 15%, 7%, and 12% in mAP, respectively.
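The mAP numbers above score detections by spatio-temporal overlap between predicted and ground-truth action tubes. A common formulation, sketched here under assumptions (exact evaluation protocols vary by benchmark), averages per-frame box IoU over the temporal union of the two tubes, with frames missing from one tube contributing zero.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def tube_iou(tube_a, tube_b):
    """tube_a, tube_b: dicts mapping frame index -> [x1, y1, x2, y2]."""
    frames = set(tube_a) | set(tube_b)
    if not frames:
        return 0.0
    # Only frames present in both tubes contribute overlap.
    overlap = sum(box_iou(tube_a[f], tube_b[f])
                  for f in set(tube_a) & set(tube_b))
    return overlap / len(frames)
```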