239 papers with code • 8 benchmarks • 37 datasets
Temporal Action Localization aims to detect action instances in untrimmed video streams and output their start and end timestamps. It is closely related to Temporal Action Proposal Generation.
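For orientation, the minimal sketch below shows the kind of output a temporal action localizer produces: a list of scored (start, end, label) segments. The `detect_actions` function and its hard-coded segments are hypothetical placeholders for illustration, not any particular published model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ActionSegment:
    label: str        # predicted action class
    start_sec: float  # segment start timestamp (seconds)
    end_sec: float    # segment end timestamp (seconds)
    score: float      # detection confidence

def detect_actions(video_path: str) -> List[ActionSegment]:
    """Hypothetical localizer: returns scored (start, end) segments per action instance."""
    # A real model would score candidate segments over the untrimmed video;
    # the hard-coded output here only illustrates the expected result format.
    return [
        ActionSegment("high_jump", start_sec=12.4, end_sec=17.9, score=0.91),
        ActionSegment("long_jump", start_sec=42.0, end_sec=48.3, score=0.78),
    ]

if __name__ == "__main__":
    for seg in detect_actions("example.mp4"):
        print(f"{seg.label}: {seg.start_sec:.1f}s - {seg.end_sec:.1f}s ({seg.score:.2f})")
```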
We show that the convolution-free VATT outperforms state-of-the-art ConvNet-based architectures in the downstream tasks.
Ranked #1 on Action Classification on Moments in Time (using extra training data)
We present Mobile Video Networks (MoViNets), a family of computation- and memory-efficient video networks that can operate on streaming video for online inference.
Ranked #1 on Action Recognition on EPIC-KITCHENS-100
The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.
Ranked #2 on Temporal Action Localization on J-HMDB-21
Second, frame-based models perform quite well on action recognition: is pre-training for good image features sufficient, or is pre-training for spatio-temporal features valuable for optimal transfer learning?
Ranked #1 on Egocentric Activity Recognition on EPIC-KITCHENS-55 (Actions Top-1 (S2) metric)
In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition.
Ranked #4 on Action Recognition on Sports-1M
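One of the convolution variants examined in such work is a factorized form that applies a 2D spatial convolution followed by a 1D temporal convolution. The sketch below illustrates that factorization in PyTorch; the module name, channel sizes, and block structure are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class FactorizedSpatioTemporalConv(nn.Module):
    """Sketch of a factorized spatiotemporal block: a spatial 2D convolution followed by
    a temporal 1D convolution, both expressed as Conv3d kernels on (N, C, T, H, W)."""

    def __init__(self, in_channels: int, out_channels: int, mid_channels: int = 64):
        super().__init__()
        # Spatial convolution: 1 x 3 x 3 kernel (no temporal extent)
        self.spatial = nn.Conv3d(in_channels, mid_channels,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
        # Temporal convolution: 3 x 1 x 1 kernel (no spatial extent)
        self.temporal = nn.Conv3d(mid_channels, out_channels,
                                  kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.temporal(self.relu(self.spatial(x)))

# Example: a batch of two 8-frame RGB clips at 112 x 112 resolution
clip = torch.randn(2, 3, 8, 112, 112)   # (batch, channels, time, height, width)
block = FactorizedSpatioTemporalConv(3, 64)
print(block(clip).shape)                 # torch.Size([2, 64, 8, 112, 112])
```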
To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map.
Ranked #1 on Action Recognition on THUMOS’14
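As an illustration of the Boundary-Matching idea, the sketch below builds a toy confidence map indexed by (duration, start), where each entry scores the proposal that starts at a given snippet and lasts a given number of snippets. The product-of-boundary-probabilities scoring is a stand-in assumption for illustration, not the learned confidence used in the actual BM mechanism.

```python
import numpy as np

def boundary_matching_map(start_probs: np.ndarray,
                          end_probs: np.ndarray,
                          max_duration: int) -> np.ndarray:
    """Toy BM-style confidence map.

    bm_map[d, s] scores the proposal starting at snippet s with length d + 1;
    here the score is simply start_probs[s] * end_probs[s + d]."""
    num_snippets = len(start_probs)
    bm_map = np.zeros((max_duration, num_snippets))
    for d in range(max_duration):
        for s in range(num_snippets):
            e = s + d
            if e < num_snippets:
                bm_map[d, s] = start_probs[s] * end_probs[e]
    return bm_map

# Example with 10 snippets and proposals up to 5 snippets long
rng = np.random.default_rng(0)
bm = boundary_matching_map(rng.random(10), rng.random(10), max_duration=5)
print(bm.shape)  # (5, 10): duration x start-index grid of proposal confidences
```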
Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos, which are long and contain a high proportion of irrelevant content.
Ranked #1 on Temporal Action Proposal Generation on THUMOS'14