TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Multimodal Activity Recognition	EV-Action	TSN (RGB)	Accuracy	73.6	# 3
Action Recognition	HMDB-51	Temporal Segment Networks	Average accuracy of 3 splits	69.4	# 55
Action Classification	Kinetics-400	TSN	Acc@1	73.9	# 157
Action Classification	Kinetics-400	TSN	Acc@5	91.1	# 114
Action Recognition	UCF101	Temporal Segment Networks	3-fold Accuracy	94.2	# 53

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-segment-networks-towards-good/multimodal-activity-recognition-on-ev-action)](https://paperswithcode.com/sota/multimodal-activity-recognition-on-ev-action?p=temporal-segment-networks-towards-good)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-segment-networks-towards-good/action-recognition-in-videos-on-ucf101)](https://paperswithcode.com/sota/action-recognition-in-videos-on-ucf101?p=temporal-segment-networks-towards-good)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-segment-networks-towards-good/action-recognition-in-videos-on-hmdb-51)](https://paperswithcode.com/sota/action-recognition-in-videos-on-hmdb-51?p=temporal-segment-networks-towards-good)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/temporal-segment-networks-towards-good/action-classification-on-kinetics-400)](https://paperswithcode.com/sota/action-classification-on-kinetics-400?p=temporal-segment-networks-towards-good)`

Temporal Segment Networks: Towards Good Practices for Deep Action Recognition

2 Aug 2016 · Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc van Gool

Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles to design effective ConvNet architectures for action recognition in videos and learn these models given limited training samples. Our first contribution is temporal segment network (TSN), a novel framework for video-based action recognition, which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network. Our approach obtains state-of-the-art performance on the datasets of HMDB51 ($69.4\%$) and UCF101 ($94.2\%$). We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of temporal segment network and the proposed good practices.
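The sparse temporal sampling and video-level supervision mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the default of three segments, and the use of plain averaging to combine snippet scores are assumptions made here for illustration, and the per-snippet scores stand in for the output of a ConvNet that is not shown.

```python
import random


def sample_segment_indices(num_frames, num_segments=3, rng=None):
    """Pick one frame index from each of num_segments equal-length segments.

    Hypothetical helper: the segment count is an illustrative default.
    """
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        # Integer boundaries so the segments tile [0, num_frames) without overlap.
        start = (k * num_frames) // num_segments
        end = ((k + 1) * num_frames) // num_segments
        # One random snippet position per segment (sparse sampling).
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(snippet_scores):
    """Average per-snippet class scores into one video-level score vector.

    Averaging is an assumed aggregation; a loss on this vector would
    supervise all snippets jointly at the video level.
    """
    num_snippets = len(snippet_scores)
    num_classes = len(snippet_scores[0])
    return [
        sum(scores[c] for scores in snippet_scores) / num_snippets
        for c in range(num_classes)
    ]


# Usage: a 90-frame video yields one index in each of [0, 30), [30, 60), [60, 90).
frame_indices = sample_segment_indices(num_frames=90, num_segments=3)
# Placeholder scores for two classes, one row per sampled snippet.
video_scores = average_consensus([[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]])
```

Because only a fixed number of snippets is processed regardless of video length, the cost per video stays constant while the samples still span the whole clip.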


Code


yjxiong/temporal-segment-networks official

1,513

yjxiong/caffe official

550

open-mmlab/mmaction2

3,887

MIT-HAN-LAB/temporal-shift-module

2,019

mindspore-ai/models

334

See all 19 implementations

Tasks


Action Classification

Action Recognition

Action Recognition In Videos

Multimodal Activity Recognition

Temporal Action Localization

Datasets

ImageNet

UCF101

Kinetics

HMDB51

Kinetics 400

Results from the Paper


Ranked #3 on Multimodal Activity Recognition on EV-Action


Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Multimodal Activity Recognition	EV-Action	TSN (RGB)	Accuracy	73.6	# 3
Action Recognition	HMDB-51	Temporal Segment Networks	Average accuracy of 3 splits	69.4	# 55
Action Recognition	UCF101	Temporal Segment Networks	3-fold Accuracy	94.2	# 53

Results from Other Papers

Task	Dataset	Model	Metric Name	Metric Value	Rank
Action Classification	Kinetics-400	TSN	Acc@1	73.9	# 157
Action Classification	Kinetics-400	TSN	Acc@5	91.1	# 114
