TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Self-supervised Video Retrieval	HMDB51	VCP (R3D)	Top-1	7.6	# 11
Self-Supervised Action Recognition	HMDB51	VCP (R3D)	Top-1 Accuracy	31.5	# 44
Self-Supervised Action Recognition	HMDB51	VCP (R3D)	Pre-Training Dataset	UCF101	# 1
Self-Supervised Action Recognition	HMDB51	VCP (R3D)	Frozen	false	# 1
Self-Supervised Action Recognition	UCF101	VCP (R3D)	3-fold Accuracy	66	# 41
Self-Supervised Action Recognition	UCF101	VCP (R3D)	Pre-Training Dataset	UCF101	# 1
Self-Supervised Action Recognition	UCF101	VCP (R3D)	Frozen	false	# 1
Self-supervised Video Retrieval	UCF101	VCP (R3D)	Top-1	18.6	# 13

Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

2 Jan 2020 · Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, Weiping Wang

We propose a novel self-supervised method, referred to as Video Cloze Procedure (VCP), to learn rich spatio-temporal representations. VCP first generates "blanks" by withholding video clips and then creates "options" by applying spatio-temporal operations to the withheld clips. Finally, it fills the blanks with the "options" and learns representations by predicting the categories of the operations applied to the clips. VCP can act as either a proxy task or a target task in self-supervised learning. As a proxy task, it converts rich self-supervised representations into video clip operations (options), which enhances the flexibility and reduces the complexity of representation learning. As a target task, it can assess learned representation models in a uniform and interpretable manner. With VCP, we train spatio-temporal representation models (3D-CNNs) and apply them to action recognition and video retrieval tasks. Experiments on commonly used benchmarks show that the trained models outperform the state-of-the-art self-supervised models by significant margins.
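The cloze construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the operation set (`identity`, `rotate_90`, `reverse_time`), the function name `make_cloze_sample`, and its parameters are hypothetical choices made for illustration, since the abstract only says that "spatio-temporal operations" are applied. The sketch assumes a video stored as a NumPy array of shape `(T, H, W, C)` with square frames.

```python
import numpy as np

# Hypothetical operation set (an assumption for illustration only);
# the abstract does not list the concrete operations.
def identity(clip):
    # Leave the withheld clip unchanged.
    return clip

def rotate_90(clip):
    # Rotate every frame by 90 degrees in the spatial (H, W) plane.
    return np.rot90(clip, k=1, axes=(1, 2))

def reverse_time(clip):
    # Reverse the frame order along the temporal axis.
    return clip[::-1]

OPERATIONS = [identity, rotate_90, reverse_time]

def make_cloze_sample(video, clip_len, blank_index, rng):
    """Build one cloze sample from a video of shape (T, H, W, C).

    Assumes H == W so that a rotated clip keeps its shape.
    Returns the clip sequence with the blank filled by an "option",
    plus the index of the operation that produced that option.
    """
    num_clips = video.shape[0] // clip_len
    # Split the video into consecutive, non-overlapping clips.
    clips = [video[i * clip_len:(i + 1) * clip_len] for i in range(num_clips)]
    # Pick an operation at random; its index is the classification target.
    label = int(rng.integers(len(OPERATIONS)))
    # Withhold the clip at blank_index and fill the blank with its transformed version.
    clips[blank_index] = OPERATIONS[label](clips[blank_index])
    return np.stack(clips), label

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((12, 8, 8, 3))  # 12 frames of 8x8 RGB
    sample, label = make_cloze_sample(video, clip_len=4, blank_index=1, rng=rng)
    print(sample.shape, OPERATIONS[label].__name__)
```

Under these assumptions, a 3D-CNN with a classification head would take `sample` as input and be trained to predict `label`, so the supervision signal comes entirely from the operation applied to the withheld clip.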

Code

BestJuly/VCP

Tasks

Action Recognition

Representation Learning

Retrieval

Self-Supervised Action Recognition

Self-Supervised Learning

Self-supervised Video Retrieval

Video Retrieval

Datasets

ImageNet

UCF101

HMDB51

Results from the Paper

Ranked #11 on Self-supervised Video Retrieval on HMDB51
