Temporal Generative Adversarial Nets with Singular Value Clipping

ICCV 2017 · Masaki Saito, Eiichi Matsumoto, Shunta Saito

In this paper, we propose a generative model, Temporal Generative Adversarial Nets (TGAN), which can learn a semantic representation of unlabeled videos and is capable of generating videos. Unlike existing Generative Adversarial Nets (GAN)-based methods that generate videos with a single generator consisting of 3D deconvolutional layers, our model uses two different types of generators: a temporal generator and an image generator. The temporal generator takes a single latent variable as input and outputs a set of latent variables, each of which corresponds to an image frame in a video. The image generator transforms a set of such latent variables into a video. To deal with the instability of training GANs with such advanced networks, we adopt a recently proposed model, Wasserstein GAN, and propose a novel method to train it stably in an end-to-end manner. The experimental results demonstrate the effectiveness of our methods.
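The two-generator data flow described in the abstract can be illustrated with a toy shape-level sketch. This is not the authors' architecture: the single linear layers, the `tanh` activations, the latent sizes `D0` and `D1`, and the function names are all assumptions chosen only to show how one latent variable becomes a set of per-frame latents and then a video.

```python
import numpy as np

# Frame count and resolution follow the "16 frames, 64x64" benchmark setting.
T, H, W = 16, 64, 64
# Latent sizes are hypothetical, picked only for illustration.
D0, D1 = 100, 100

rng = np.random.default_rng(0)
# Random stand-ins for learned weights (assumption: one linear layer each).
w_temporal = rng.normal(scale=0.1, size=(T * D1, D0))
w_image = rng.normal(scale=0.1, size=(D1, H * W))


def temporal_generator(z0):
    """Map a single latent vector to T per-frame latent vectors."""
    return np.tanh(w_temporal @ z0).reshape(T, D1)


def image_generator(frame_latents):
    """Map each per-frame latent vector to one grayscale frame in [-1, 1]."""
    return np.tanh(frame_latents @ w_image).reshape(T, H, W)


z0 = rng.normal(size=D0)                  # single latent variable
frame_latents = temporal_generator(z0)    # shape (T, D1): one latent per frame
video = image_generator(frame_latents)    # shape (T, H, W): one frame per latent

print(frame_latents.shape, video.shape)
```

In a real model, both stages would presumably be deep (de)convolutional networks trained adversarially; the sketch only fixes the tensor shapes passed between the two generators.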

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Video Generation | UCF-101 16 frames, 64x64, Unconditional | TGAN-SVC | Inception Score | 11.85 | # 6
Video Generation | UCF-101 16 frames, Unconditional, Single GPU | TGAN-SVC | Inception Score | 11.85 | # 6