Video Recognition

55 papers with code • 0 benchmarks • 4 datasets

Greatest papers with code

MoViNets: Mobile Video Networks for Efficient Video Recognition

tensorflow/models • CVPR 2021

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.

Action Classification • Action Recognition • +2

Multiscale Vision Transformers

facebookresearch/SlowFast • 22 Apr 2021

We evaluate this fundamental architectural prior for modeling the dense nature of visual signals for a variety of video recognition tasks where it outperforms concurrent vision transformers that rely on large scale external pre-training and are 5-10x more costly in computation and parameters.

Action Classification • Action Recognition • +2

X3D: Expanding Architectures for Efficient Video Recognition

facebookresearch/SlowFast • CVPR 2020

This paper presents X3D, a family of efficient video networks that progressively expand a tiny 2D image classification architecture along multiple network axes, in space, time, width and depth.

Action Classification • Feature Selection • +4

Audiovisual SlowFast Networks for Video Recognition

facebookresearch/SlowFast • 23 Jan 2020

We present Audiovisual SlowFast Networks, an architecture for integrated audiovisual perception.

Action Classification • Video Recognition

Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?

kenshohara/3D-ResNets-PyTorch • 10 Apr 2020

Therefore, in the present paper, we conduct an exploratory study to improve spatiotemporal 3D CNNs as follows: (i) Recently proposed large-scale video datasets help improve spatiotemporal 3D CNNs in terms of video classification accuracy.

General Classification • Video Classification • +1

Omni-sourced Webly-supervised Learning for Video Recognition

open-mmlab/mmaction • ECCV 2020

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

Ranked #2 on Action Recognition on UCF101 (using extra training data)

Action Classification • Action Recognition • +1

Sequence Level Semantics Aggregation for Video Object Detection

open-mmlab/mmtracking • ICCV 2019

In this work, we argue that aggregating features in the full-sequence level will lead to more discriminative and robust features for video object detection.

Video Object Detection • Video Recognition