Video Understanding

24 papers with code · Computer Vision
Subtask of Video

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

CVPR 2018 tensorflow/models

The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.

ACTION LOCALIZATION VIDEO UNDERSTANDING ZERO-SHOT ACTION RECOGNITION

Detect-and-Track: Efficient Pose Estimation in Videos

CVPR 2018 facebookresearch/DetectAndTrack

This paper addresses the problem of estimating and tracking human body keypoints in complex, multi-person video.

#4 best model for Pose Tracking on PoseTrack2017 (using extra training data)

HUMAN DETECTION MULTI-OBJECT TRACKING POSE ESTIMATION POSE TRACKING VIDEO UNDERSTANDING

TS-LSTM and Temporal-Inception: Exploiting Spatiotemporal Dynamics for Activity Recognition

30 Mar 2017 jeffreyhuang1/two-stream-action-recognition

We demonstrate that both RNNs (using LSTMs) and Temporal-ConvNets on spatiotemporal feature matrices are able to exploit spatiotemporal dynamics to improve the overall performance.

ACTION CLASSIFICATION ACTION RECOGNITION IN VIDEOS VIDEO UNDERSTANDING

TSM: Temporal Shift Module for Efficient Video Understanding

20 Nov 2018 MIT-HAN-LAB/temporal-shift-module

The explosive growth in video streaming gives rise to challenges in performing video understanding at high accuracy and low computation cost.

VIDEO OBJECT DETECTION VIDEO RECOGNITION VIDEO UNDERSTANDING

Learnable pooling with Context Gating for video classification

21 Jun 2017 antoine77340/Youtube-8M-WILLOW

In particular, we evaluate our method on the large-scale multi-modal YouTube-8M v2 dataset and outperform all other methods in the YouTube-8M Large-Scale Video Understanding challenge.

VIDEO CLASSIFICATION VIDEO UNDERSTANDING

ECO: Efficient Convolutional Network for Online Video Understanding

ECCV 2018 mzolfaghari/ECO-efficient-video-understanding

In this paper, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time.

#12 best model for Action Recognition In Videos on Something-Something V1 (using extra training data)

ACTION CLASSIFICATION ACTION RECOGNITION IN VIDEOS VIDEO CAPTIONING VIDEO RETRIEVAL VIDEO UNDERSTANDING

End-to-End Learning of Motion Representation for Video Understanding

CVPR 2018 LijieFan/tvnet

Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks.

ACTION RECOGNITION IN VIDEOS OPTICAL FLOW ESTIMATION VIDEO UNDERSTANDING

Long-Term Feature Banks for Detailed Video Understanding

CVPR 2019 facebookresearch/video-long-term-feature-banks

To understand the world, we humans constantly need to relate the present to the past, and put events in context.

#3 best model for Action Classification on Charades (using extra training data)

ACTION CLASSIFICATION ACTION RECOGNITION IN VIDEOS VIDEO UNDERSTANDING

The Monkeytyping Solution to the YouTube-8M Video Understanding Challenge

16 Jun 2017 wangheda/youtube-8m

This article describes the final solution of team monkeytyping, who finished in second place in the YouTube-8M video understanding challenge.

VIDEO CLASSIFICATION VIDEO UNDERSTANDING

Representation Flow for Action Recognition

CVPR 2019 piergiaj/representation-flow-cvpr19

Our representation flow layer is a fully-differentiable layer designed to capture the 'flow' of any representation channel within a convolutional neural network for action recognition.

#2 best model for Action Classification on HMDB51 (using extra training data)

ACTION CLASSIFICATION ACTIVITY RECOGNITION ACTIVITY RECOGNITION IN VIDEOS OPTICAL FLOW ESTIMATION VIDEO CLASSIFICATION VIDEO UNDERSTANDING