Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning

13 Apr 2017  ·  Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Shin Ishii ·

We propose a new regularization method based on virtual adversarial loss: a new measure of local smoothness of the conditional label distribution given an input. Virtual adversarial loss is defined as the robustness of the conditional label distribution around each input data point against local perturbation. Unlike adversarial training, our method defines the adversarial direction without label information and is hence applicable to semi-supervised learning. Because the directions in which we smooth the model are only "virtually" adversarial, we call our method virtual adversarial training (VAT). The computational cost of VAT is relatively low: for neural networks, the approximated gradient of the virtual adversarial loss can be computed with no more than two pairs of forward and back propagations. In our experiments, we applied VAT to supervised and semi-supervised learning tasks on multiple benchmark datasets. With a simple enhancement of the algorithm based on the entropy minimization principle, VAT achieves state-of-the-art performance for semi-supervised learning tasks on SVHN and CIFAR-10.
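The procedure the abstract describes can be sketched as follows: start from a random direction, take one power-iteration step toward the gradient of the KL divergence between the model's prediction at the input and at the perturbed input, rescale to a fixed norm, and use the resulting KL divergence as the regularization term. This is a minimal NumPy sketch, not the authors' implementation: the stand-in linear softmax model `predict`, the sample input `x`, and all hyperparameter values are illustrative assumptions, and a finite-difference gradient replaces the single extra backprop the paper uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in model: a fixed 2-input, 3-class linear softmax
# classifier. The paper applies VAT to deep nets; any differentiable
# p(y|x) plays the same role here.
W = rng.normal(size=(2, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(x):
    # p(y | x) for a single input x of shape (2,)
    return softmax(x @ W)

def kl(p, q):
    # KL divergence between two discrete distributions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

def vat_perturbation(x, eps=1.0, xi=0.1, n_power=1, h=1e-5):
    # One power-iteration step as in the paper: start from a random
    # direction d, replace it by the gradient of
    # KL(p(.|x) || p(.|x + xi*d)) with respect to the perturbation,
    # then rescale the result to length eps. The paper obtains this
    # gradient with one extra forward/backward pass; a finite-difference
    # estimate stands in here so the sketch needs no autodiff library.
    p = predict(x)
    d = rng.normal(size=x.shape)
    for _ in range(n_power):
        d = xi * d / np.linalg.norm(d)
        g = np.zeros_like(d)
        for i in range(d.size):
            e = np.zeros_like(d)
            e[i] = h
            g[i] = (kl(p, predict(x + d + e)) - kl(p, predict(x + d - e))) / (2 * h)
        d = g
    return eps * d / (np.linalg.norm(d) + 1e-12)

x = np.array([0.5, -1.0])
r_vadv = vat_perturbation(x, eps=0.5)
# The virtual adversarial (LDS) loss, added to the supervised objective;
# note it needs no label for x, which is why VAT works semi-supervised.
vat_loss = kl(predict(x), predict(x + r_vadv))
```

The "no more than two pairs of forward- and back-propagations" claim corresponds to one pass for the power-iteration gradient and one for differentiating the resulting KL term with respect to the model parameters; the entropy-minimization variant simply adds the entropy of `predict(x)` to the objective.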


Datasets

SVHN, CIFAR-10
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Image Classification | CIFAR-10, 250 Labels | VAT | Percentage correct | 63.97 | #4 |
| Semi-Supervised Image Classification | CIFAR-10, 250 Labels | VAT | Percentage error | 36.03 | #20 |
| Semi-Supervised Image Classification | CIFAR-10, 4000 Labels | VAT | Percentage error | 11.36 | #39 |
| Semi-Supervised Image Classification | CIFAR-10, 4000 Labels | VAT+EntMin | Percentage error | 10.55 | #37 |
| Semi-Supervised Image Classification | SVHN, 1000 Labels | VAT | Accuracy | 94.58 | #15 |

Results from Other Papers


| Task | Dataset | Model | Metric | Value | Rank |
|---|---|---|---|---|---|
| Semi-Supervised Image Classification | SVHN, 250 Labels | VAT | Accuracy | 91.59 | #13 |

Methods


No methods listed for this paper.