Generating adversarial examples is an intriguing problem and an important way
of understanding the inner workings of deep neural networks. Most existing
approaches generate perturbations in the image space, i.e., each pixel can be
modified independently. In this paper, however, we pay special attention to the
subset of adversarial examples that correspond to meaningful changes in 3D
physical properties, such as rotation, translation, and illumination
conditions. These adversaries arguably pose a more serious concern, as they
demonstrate that neural networks can be made to fail by simple perturbations
of real-world 3D objects and scenes.
In the contexts of object classification and visual question answering, we
prepend a rendering module (either differentiable or not) to state-of-the-art
deep neural networks that take 2D images as input, so that a 3D scene (in the
physical space) is first rendered into a 2D image (in the image space) and
then mapped to a prediction (in the output space). Adversarial perturbations
can now go beyond the image space and carry clear meanings in the 3D physical
world. Although image-space adversaries can be interpreted as per-pixel albedo
changes, we verify that they cannot be well explained along these physically
meaningful dimensions, which often have non-local effects.
Nevertheless, it remains possible to attack successfully beyond the image
space, in the physical space, although such attacks are more difficult than
image-space attacks, as reflected in their lower success rates and the heavier
perturbations they require.
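As a concrete illustration of attacking through a rendering module rather than in pixel space, the sketch below runs gradient ascent on a few physical parameters (rotation, translation, illumination) that are passed through a differentiable rendering step before reaching a classifier. It is a minimal sketch only: the toy renderer (a 2D affine resampling plus a global brightness scale), the small CNN, and every name and hyperparameter below are illustrative assumptions, not the rendering module or networks used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def render(texture, theta, tx, ty, light):
    """Toy differentiable 'renderer': rotate/translate a texture and scale
    its brightness. A stand-in for a real 3D renderer (assumption)."""
    cos, sin = torch.cos(theta), torch.sin(theta)
    # 2x3 affine matrix (rotation + translation in normalized coordinates).
    mat = torch.stack([torch.stack([cos, -sin, tx]),
                       torch.stack([sin,  cos, ty])]).unsqueeze(0)  # (1, 2, 3)
    grid = F.affine_grid(mat, texture.shape, align_corners=False)
    img = F.grid_sample(texture, grid, align_corners=False)
    return (light * img).clamp(0, 1)         # global illumination scaling

# Small stand-in classifier; in the paper this would be a pretrained network.
classifier = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))

texture = torch.rand(1, 3, 64, 64)           # fixed scene content: NOT attacked
target = torch.tensor([3])                   # arbitrary ground-truth label

# Only the physical parameters require gradients, so the attack is forced
# to live in this low-dimensional, physically meaningful space.
theta = torch.zeros((), requires_grad=True)  # rotation angle (radians)
tx = torch.zeros((), requires_grad=True)     # horizontal translation
ty = torch.zeros((), requires_grad=True)     # vertical translation
light = torch.ones((), requires_grad=True)   # illumination scale

opt = torch.optim.SGD([theta, tx, ty, light], lr=0.05)
for _ in range(50):                          # PGD-style iterative attack
    img = render(texture, theta, tx, ty, light)
    loss = F.cross_entropy(classifier(img), target)
    opt.zero_grad()
    (-loss).backward()                       # ascend the classification loss
    opt.step()
    with torch.no_grad():                    # project back to plausible ranges
        theta.clamp_(-0.3, 0.3)
        tx.clamp_(-0.1, 0.1); ty.clamp_(-0.1, 0.1)
        light.clamp_(0.7, 1.3)
```

Note how each physical parameter influences many pixels at once through the renderer; the attack has only four degrees of freedom instead of one per pixel, which is consistent with physical-space attacks being harder to mount than image-space ones.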