About

Please note some benchmarks may be located in the Action Classification or Video Classification tasks, e.g. Kinetics-400.

Greatest papers with code

MoViNets: Mobile Video Networks for Efficient Video Recognition

21 Mar 2021 tensorflow/models

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

ACTION CLASSIFICATION, ACTION RECOGNITION, NEURAL ARCHITECTURE SEARCH, VIDEO RECOGNITION

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

CVPR 2018 tensorflow/models

The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.

ACTION RECOGNITION, VIDEO UNDERSTANDING

Non-local Neural Networks

CVPR 2018 facebookresearch/detectron

Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.

Ranked #8 on Keypoint Detection on COCO (Validation AP metric)

ACTION CLASSIFICATION, ACTION RECOGNITION, INSTANCE SEGMENTATION, KEYPOINT DETECTION, OBJECT DETECTION

View-Invariant, Occlusion-Robust Probabilistic Embedding for Human Pose

23 Oct 2020 google-research/google-research

We further show that keypoint occlusion augmentation during training significantly improves retrieval performance on partial 2D input poses.

3D POSE ESTIMATION, ACTION RECOGNITION, VIDEO ALIGNMENT

Unsupervised Learning of Object Structure and Dynamics from Videos

NeurIPS 2019 google-research/google-research

Extracting and predicting object structure and dynamics from videos without supervision is a major challenge in machine learning.

ACTION RECOGNITION, CONTINUOUS CONTROL, OBJECT TRACKING, VIDEO PREDICTION

Large-scale weakly-supervised pre-training for video action recognition

CVPR 2019 microsoft/computervision-recipes

Second, frame-based models perform quite well on action recognition; is pre-training for good image features sufficient or is pre-training for spatio-temporal features valuable for optimal transfer learning?

Ranked #1 on Egocentric Activity Recognition on EPIC-KITCHENS-55 (Actions Top-1 (S2) metric)

ACTION CLASSIFICATION, ACTION RECOGNITION, ACTIVITY RECOGNITION IN VIDEOS, EGOCENTRIC ACTIVITY RECOGNITION, TRANSFER LEARNING

A Closer Look at Spatiotemporal Convolutions for Action Recognition

CVPR 2018 microsoft/computervision-recipes

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition.

ACTION CLASSIFICATION, ACTION RECOGNITION

BMN: Boundary-Matching Network for Temporal Action Proposal Generation

ICCV 2019 PaddlePaddle/models

To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals, which denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map.

ACTION DETECTION, ACTION RECOGNITION, TEMPORAL ACTION PROPOSAL GENERATION
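
The idea in the BMN summary above, treating each proposal as a (start, end) boundary pair and collecting all pairs into one confidence map, can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: the function names, the grid layout (rows indexed by duration, columns by start), and the use of temporal IoU against a made-up reference segment as the "confidence" are all assumptions chosen for readability. In the actual model the scores would come from a trained network.

```python
def enumerate_proposals(num_locations, max_duration):
    """Lay out every (start, end) boundary pair on a duration x start grid.

    grid[d - 1][s] holds the pair (s, s + d), or None when the proposal
    would run past the end of the video. Hypothetical helper.
    """
    grid = []
    for duration in range(1, max_duration + 1):
        row = []
        for start in range(num_locations):
            end = start + duration
            row.append((start, end) if end <= num_locations else None)
        grid.append(row)
    return grid


def temporal_iou(start, end, ref_start, ref_end):
    """Intersection over union of two temporal segments."""
    intersection = max(0, min(end, ref_end) - max(start, ref_start))
    union = (end - start) + (ref_end - ref_start) - intersection
    return intersection / union


def build_confidence_map(grid, score_fn):
    """Score every valid boundary pair; invalid cells get 0.0."""
    return [
        [score_fn(*cell) if cell is not None else 0.0 for cell in row]
        for row in grid
    ]


# Toy setup (assumed values): 6 temporal locations, proposals up to 4 long,
# scored by overlap with a made-up reference action spanning [2, 5).
grid = enumerate_proposals(num_locations=6, max_duration=4)
confidence_map = build_confidence_map(
    grid, lambda s, e: temporal_iou(s, e, 2, 5)
)

# Pick the highest-scoring cell and map it back to its boundary pair.
best_score, best_pair = max(
    (confidence_map[d][s], grid[d][s])
    for d in range(len(grid))
    for s in range(len(grid[d]))
    if grid[d][s] is not None
)
print(best_pair, best_score)
```

Indexing the map by duration and start means every candidate proposal has a fixed cell, so all of them can be scored in one pass rather than one at a time.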