Folded Recurrent Neural Networks for Future Video Prediction

ECCV 2018 · Marc Oliu, Javier Selva, Sergio Escalera

Future video prediction is an ill-posed Computer Vision problem that has recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames, there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show that with this topology only the encoder needs to be applied for input encoding and only the decoder for prediction. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving insight into the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state-of-the-art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times lower memory usage and computational cost, respectively, than the best-scoring approach.
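The sketch below illustrates the state-sharing idea under simplifying assumptions: the paper uses convolutional bGRU layers, while this version uses plain fully connected GRU gates, and the names GRUCellLayer and FoldedRNN are invented for illustration. Each layer owns a single state that is written by a bottom-up GRU during encoding and by a top-down GRU during prediction, so generating a frame never requires re-encoding it.

```python
import torch
import torch.nn as nn


class GRUCellLayer(nn.Module):
    """Plain GRU update h' = GRU(x, h); used for both halves of a bGRU."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gates = nn.Linear(in_dim + hid_dim, 2 * hid_dim)  # update/reset
        self.cand = nn.Linear(in_dim + hid_dim, hid_dim)       # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], -1))).chunk(2, -1)
        h_new = torch.tanh(self.cand(torch.cat([x, r * h], -1)))
        return (1.0 - z) * h + z * h_new


class FoldedRNN(nn.Module):
    """Stack of bijective GRU layers, one shared state per layer.

    dims[0] is the (flattened) frame size; dims[1:] are the per-layer
    state sizes. Encoding applies only the bottom-up GRUs; prediction
    applies only the top-down GRUs, so predictions are never re-encoded.
    """
    def __init__(self, dims):
        super().__init__()
        self.n = len(dims) - 1
        # Forward half of each bGRU: input comes from the layer below.
        self.fwd = nn.ModuleList(
            [GRUCellLayer(dims[i], dims[i + 1]) for i in range(self.n)])
        # Backward half: input comes from the layer above (top layer has none).
        self.bwd = nn.ModuleList(
            [GRUCellLayer(dims[i + 2], dims[i + 1]) for i in range(self.n - 1)])
        self.readout = nn.Linear(dims[1], dims[0])  # layer-0 state -> frame
        self.dims = dims

    def init_states(self, batch, device):
        return [torch.zeros(batch, d, device=device) for d in self.dims[1:]]

    def encode(self, frame, states):
        """One encoder step: update every layer state bottom-up."""
        inp = frame
        for i in range(self.n):
            states[i] = self.fwd[i](inp, states[i])
            inp = states[i]
        return states

    def predict(self, states):
        """One decoder step: update states top-down, emit the next frame."""
        for i in range(self.n - 2, -1, -1):
            states[i] = self.bwd[i](states[i + 1], states[i])
        return torch.sigmoid(self.readout(states[0])), states


if __name__ == "__main__":
    model = FoldedRNN([64 * 64, 256, 128])
    states = model.init_states(batch=2, device="cpu")
    context = torch.rand(10, 2, 64 * 64)         # 10 context frames
    for t in range(10):                           # encoder half only
        states = model.encode(context[t], states)
    for _ in range(5):                            # decoder half only: predicted
        frame, states = model.predict(states)     # frames are not re-encoded
    print(frame.shape)                            # torch.Size([2, 4096])
```

Because each half of a bGRU is self-contained, the topmost layers of a trained stack can in principle be dropped (truncating fwd and bwd) and prediction still runs from the remaining states, which is the property the paper exploits to analyse the role of each layer.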


Results from the Paper

Task             | Dataset   | Model | Metric | Value  | Global Rank
-----------------|-----------|-------|--------|--------|------------
Video Prediction | Human3.6M | fRNN  | SSIM   | 0.771  | #7
Video Prediction | Human3.6M | fRNN  | MSE    | 497.7  | #5
Video Prediction | Human3.6M | fRNN  | MAE    | 1901.1 | #5
Video Prediction | KTH       | fRNN  | PSNR   | 26.12  | #21
Video Prediction | KTH       | fRNN  | SSIM   | 0.771  | #25
Video Prediction | KTH       | fRNN  | Cond   | 10     | #1
Video Prediction | KTH       | fRNN  | Pred   | 20     | #1

Cond and Pred denote the number of conditioning and predicted frames in the KTH evaluation protocol, not quality metrics.
