TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Video Saliency Detection	DHF1K	TASED-Net	NSS	2.667	# 3
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	SIM	0.610	# 3
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	CC	0.710	# 2
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	NSS	1.96	# 3
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	AUC-J	0.852	# 4
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	KLDiv	0.538	# 4
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	FPS	1.85	# 12

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/tased-net-temporally-aggregating-spatial/video-saliency-detection-on-dhf1k)](https://paperswithcode.com/sota/video-saliency-detection-on-dhf1k?p=tased-net-temporally-aggregating-spatial)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/tased-net-temporally-aggregating-spatial/video-saliency-detection-on-msu-video)](https://paperswithcode.com/sota/video-saliency-detection-on-msu-video?p=tased-net-temporally-aggregating-spatial)`

TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection

ICCV 2019 · Kyle Min, Jason J. Corso ·

TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It consists of two building blocks: first, the encoder network extracts low-resolution spatiotemporal features from an input clip of several consecutive frames, and then the following prediction network decodes the encoded features spatially while aggregating all the temporal information. As a result, a single prediction map is produced from an input clip of multiple frames. Frame-wise saliency maps can be predicted by applying TASED-Net in a sliding-window fashion to a video. The proposed approach assumes that the saliency map of any frame can be predicted by considering a limited number of past frames. The results of our extensive experiments on video saliency detection validate this assumption and demonstrate that our fully-convolutional model with temporal aggregation method is effective. TASED-Net significantly outperforms previous state-of-the-art approaches on all three major large-scale datasets of video saliency detection: DHF1K, Hollywood2, and UCFSports. After analyzing the results qualitatively, we observe that our model is especially better at attending to salient moving objects.

PDF Abstract ICCV 2019 PDF ICCV 2019 Abstract

Code

Add Remove Mark official

kylemin/TASED-Net official

102

Tasks

Add Remove

Saliency Detection

Video Saliency Detection

Datasets

Kinetics

SALICON

DHF1K

MSU Video Saliency Prediction

Results from the Paper

Edit

Ranked #3 on Video Saliency Detection on MSU Video Saliency Prediction

Get a GitHub badge

Task	Dataset	Model	Metric Name	Metric Value	Global Rank	Benchmark
Video Saliency Detection	DHF1K	TASED-Net	NSS	2.667	# 3	Compare
Video Saliency Detection	MSU Video Saliency Prediction	TASED-Net	SIM	0.610	# 3	Compare
			CC	0.710	# 2	Compare
			NSS	1.96	# 3	Compare
			AUC-J	0.852	# 4	Compare
			KLDiv	0.538	# 4	Compare
			FPS	1.85	# 12	Compare

Methods

Add Remove

No methods listed for this paper. Add relevant methods here

Edit Social Preview

TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove