Many deep learning models are vulnerable to the adversarial attack, i.e.,
imperceptible but intentionally-designed perturbations to the input can cause
incorrect output of the networks. In this paper, using information geometry, we
provide a reasonable explanation for the vulnerability of deep learning models...
By considering the data space as a non-linear space with the Fisher information
metric induced from a neural network, we first propose an adversarial attack
algorithm termed one-step spectral attack (OSSA). The method is described by a
constrained quadratic form of the Fisher information matrix, where the optimal
adversarial perturbation is given by the first eigenvector, and the model
vulnerability is reflected by the eigenvalues. The larger an eigenvalue is, the
more vulnerable the model is to be attacked by the corresponding eigenvector. Taking advantage of the property, we also propose an adversarial detection
method with the eigenvalues serving as characteristics. Both our attack and
detection algorithms are numerically optimized to work efficiently on large
datasets. Our evaluations show superior performance compared with other
methods, implying that the Fisher information is a promising approach to
investigate the adversarial attacks and defenses.