About

Benchmarks

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Datasets

Greatest papers with code

MoViNets: Mobile Video Networks for Efficient Video Recognition

21 Mar 2021 tensorflow/models

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

ACTION CLASSIFICATION ACTION RECOGNITION NEURAL ARCHITECTURE SEARCH VIDEO RECOGNITION

X3D: Expanding Architectures for Efficient Video Recognition

CVPR 2020 facebookresearch/SlowFast

This paper presents X3D, a family of efficient video networks that progressively expand a tiny 2D image classification architecture along multiple network axes, in space, time, width and depth.

ACTION CLASSIFICATION FEATURE SELECTION IMAGE CLASSIFICATION VIDEO CLASSIFICATION VIDEO RECOGNITION

Audiovisual SlowFast Networks for Video Recognition

23 Jan 2020 facebookresearch/SlowFast

We present Audiovisual SlowFast Networks, an architecture for integrated audiovisual perception.

ACTION CLASSIFICATION VIDEO RECOGNITION

Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?

10 Apr 2020 kenshohara/3D-ResNets-PyTorch

Therefore, in the present paper, we conduct an exploratory study in order to improve spatiotemporal 3D CNNs as follows: (i) Recently proposed large-scale video datasets help improve spatiotemporal 3D CNNs in terms of video classification accuracy.

VIDEO CLASSIFICATION VIDEO RECOGNITION

Omni-sourced Webly-supervised Learning for Video Recognition

ECCV 2020 open-mmlab/mmaction

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

Ranked #1 on Action Classification on Kinetics-400 (using extra training data)

ACTION CLASSIFICATION ACTION RECOGNITION VIDEO RECOGNITION

TSM: Temporal Shift Module for Efficient Video Understanding

ICCV 2019 MIT-HAN-LAB/temporal-shift-module

The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.

Ranked #4 on Action Recognition on Something-Something V2 (using extra training data)

ACTION CLASSIFICATION ACTION RECOGNITION VIDEO OBJECT DETECTION VIDEO RECOGNITION VIDEO UNDERSTANDING

Sequence Level Semantics Aggregation for Video Object Detection

ICCV 2019 open-mmlab/mmtracking

In this work, we argue that aggregating features in the full-sequence level will lead to more discriminative and robust features for video object detection.

VIDEO OBJECT DETECTION VIDEO RECOGNITION

Flow-Guided Feature Aggregation for Video Object Detection

ICCV 2017 open-mmlab/mmtracking

The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc.

VIDEO OBJECT DETECTION VIDEO RECOGNITION