Temporal Action Localization

239 papers with code • 8 benchmarks • 37 datasets

Temporal Action Localization aims to detect activities in a video stream and output the beginning and end timestamps of each one. It is closely related to Temporal Action Proposal Generation.
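As a rough illustration of that output, each detection can be pictured as a (start, end, label) segment. The sketch below is an assumption for illustration only: `Segment` and `temporal_iou` are hypothetical names that do not come from any paper listed here. Temporal intersection-over-union is a common way to compare a predicted segment with an annotated one.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical container for one localized action."""
    start: float  # beginning timestamp, in seconds
    end: float    # end timestamp, in seconds
    label: str


def temporal_iou(a: Segment, b: Segment) -> float:
    """Length of the overlap divided by the length of the union."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


prediction = Segment(start=2.0, end=6.0, label="jump")
ground_truth = Segment(start=4.0, end=8.0, label="jump")
print(temporal_iou(prediction, ground_truth))
```

Here the two segments share 2 seconds out of a 6-second union, so the score is one third. Segments that do not overlap score 0.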

Greatest papers with code

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

tensorflow/models 22 Apr 2021

We show that the convolution-free VATT outperforms state-of-the-art ConvNet-based architectures in the downstream tasks.

Ranked #1 on Action Classification on Moments in Time (using extra training data)

Action Classification • Action Recognition +6

MoViNets: Mobile Video Networks for Efficient Video Recognition

tensorflow/models CVPR 2021

We present Mobile Video Networks (MoViNets), a family of computation- and memory-efficient video networks that can operate on streaming video for online inference.

Action Classification • Action Recognition +2

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

tensorflow/models CVPR 2018

The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.

Action Recognition • Video Understanding

Large-scale weakly-supervised pre-training for video action recognition

microsoft/computervision-recipes CVPR 2019

Frame-based models perform quite well on action recognition; is pre-training for good image features sufficient, or is pre-training for spatio-temporal features valuable for optimal transfer learning?

Ranked #1 on Egocentric Activity Recognition on EPIC-KITCHENS-55 (Actions Top-1 (S2) metric)

Action Classification • Action Recognition +3

A Closer Look at Spatiotemporal Convolutions for Action Recognition

microsoft/computervision-recipes CVPR 2018

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition.

Action Classification • Action Recognition

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

PaddlePaddle/models ICCV 2019

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map.

Action Detection • Action Recognition +1
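The idea in the BMN summary above, treating each proposal as a (start, end) pair and collecting all pairs into one map, can be sketched as follows. This is a minimal illustration under stated assumptions, not the network from the paper: the function names are hypothetical, the layout (rows index duration, columns index start position) is one possible choice, and the score is a stand-in product of boundary probabilities rather than a learned confidence.

```python
def dense_proposal_map(start_prob, end_prob, max_duration):
    """Score every (start, end) pair with end - start <= max_duration.

    Entry [d - 1][s] holds the stand-in score of the proposal that starts
    at position s and ends at position s + d. Pairs that would run past
    the end of the sequence stay at 0.
    """
    num_positions = len(start_prob)
    conf_map = [[0.0] * num_positions for _ in range(max_duration)]
    for d in range(1, max_duration + 1):
        for s in range(num_positions - d):
            # Assumption: product of boundary probabilities as a toy score.
            conf_map[d - 1][s] = start_prob[s] * end_prob[s + d]
    return conf_map


def best_proposal(conf_map):
    """Return (start, end) of the highest-scoring entry in the map."""
    _, d_idx, s_idx = max(
        (value, d_idx, s_idx)
        for d_idx, row in enumerate(conf_map)
        for s_idx, value in enumerate(row)
    )
    return s_idx, s_idx + d_idx + 1


start_prob = [0.1, 0.9, 0.2, 0.1, 0.1]
end_prob = [0.1, 0.1, 0.2, 0.8, 0.1]
conf_map = dense_proposal_map(start_prob, end_prob, max_duration=3)
print(best_proposal(conf_map))
```

With these toy inputs the strongest pair starts at position 1 and ends at position 3, where both boundary probabilities peak.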

BSN: Boundary Sensitive Network for Temporal Action Proposal Generation

PaddlePaddle/models ECCV 2018

Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long duration and a high proportion of irrelevant content.

Action Detection • Temporal Action Proposal Generation

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

jacobgil/pytorch-grad-cam 30 Oct 2017

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems.

3D Action Recognition • Knowledge Distillation +1

Hide-and-Seek: A Data Augmentation Technique for Weakly-Supervised Localization and Beyond

PaddlePaddle/PaddleClas 6 Nov 2018

Our approach only needs to modify the input image and can work with any network to improve its performance.

Data Augmentation • Emotion Recognition +5
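An augmentation that only modifies the input image, as the Hide-and-Seek summary above describes, might look like the sketch below: the image is split into a grid and each cell is overwritten with some probability. The function name, the parameters, the default fill value, and the plain list-of-rows grayscale image are all assumptions made for illustration, not the authors' implementation.

```python
import random


def hide_patches(image, grid_size, hide_prob, fill_value=0, rng=None):
    """Return a copy of `image` with randomly chosen grid cells overwritten.

    `image` is a list of rows (a 2-D grayscale image, for simplicity).
    Each grid_size x grid_size cell is hidden with probability `hide_prob`.
    """
    rng = rng or random.Random()
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for top in range(0, height, grid_size):
        for left in range(0, width, grid_size):
            if rng.random() < hide_prob:
                for y in range(top, min(top + grid_size, height)):
                    for x in range(left, min(left + grid_size, width)):
                        out[y][x] = fill_value
    return out


image = [[1] * 8 for _ in range(8)]
augmented = hide_patches(image, grid_size=4, hide_prob=0.5, rng=random.Random(0))
for row in augmented:
    print(row)
```

Because the function touches only the input, it can sit in front of any model's data pipeline without changes to the network itself.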