In particular, we propose a tree-based autoencoder to encode discrete text data into a continuous vector space, upon which we optimize the adversarial perturbation.
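As a rough sketch of this pipeline (the `encoder`, `decoder`, and `classifier` modules below are hypothetical stand-ins, not the proposed tree-based architecture), one can optimize a perturbation of the latent code by gradient ascent on the classifier's loss:

```python
import torch

def optimize_latent_perturbation(encoder, decoder, classifier, tokens, label,
                                 steps=50, lr=0.1, eps=1.0):
    """Gradient-ascent sketch: perturb the continuous latent code so the
    decoded text is misclassified. All three modules are assumed stand-ins;
    the decoder is assumed to return a differentiable (soft) token
    representation so gradients can flow end to end."""
    z = encoder(tokens).detach()                  # continuous latent code
    delta = torch.zeros_like(z, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = classifier(decoder(z + delta))   # classify the decoded text
        loss = -torch.nn.functional.cross_entropy(logits, label)  # ascend loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)               # keep the perturbation small
    return delta.detach()
```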
Given a state-of-the-art deep neural network text classifier, we show that there exists a universal, very small perturbation vector in the embedding space that causes natural text to be misclassified with high probability.
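A minimal sketch of evaluating such a universal perturbation, assuming direct access to the model's embedding layer (`embed` and `head` below are hypothetical model parts, and the vector `v` is assumed to have been found already):

```python
import torch

def fooling_rate(embed, head, v, dataset):
    """Fraction of inputs whose prediction flips when the single vector v
    is added to every token embedding; embed/head are assumed stand-ins
    for the embedding layer and the rest of the classifier."""
    flipped, total = 0, 0
    with torch.no_grad():
        for tokens in dataset:
            e = embed(tokens)                  # (seq_len, dim) token embeddings
            clean = head(e).argmax(-1)         # prediction on the clean input
            adv = head(e + v).argmax(-1)       # the same v added to every embedding
            flipped += int(adv != clean)
            total += 1
    return flipped / total
```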
Adversarial examples are artificially modified input samples that cause misclassification while remaining undetectable to humans.
In this paper, we formulate attacks on set functions with discrete inputs as an optimization task.
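One generic way to write such a formulation (the notation below is a placeholder, not taken from the source): for a set function $f$ defined over a ground set $V$, an input $x$ with label $y$, and a modification budget $k$, the attacker solves

$$\max_{S \subseteq V,\ |S| \le k} \ \mathcal{L}\big(f(x \oplus S),\, y\big),$$

where $x \oplus S$ denotes $x$ with the elements indexed by $S$ modified and $\mathcal{L}$ is the attack loss. The combinatorial constraint over subsets is what distinguishes this discrete setting from standard continuous perturbation attacks.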
This is largely because sequences of text are discrete, and thus gradients cannot propagate from the discriminator to the generator.
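The snippet below illustrates the obstacle and a well-known workaround, the Gumbel-softmax relaxation (a standard technique, not necessarily the one adopted here): hard `argmax` sampling yields integer indices with no gradient, while the relaxed sample keeps the computation graph intact.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 10, requires_grad=True)  # generator scores over 10 tokens

# Hard sampling: argmax returns integer token ids, so autograd records no
# operation and no gradient can reach the generator's parameters.
hard_token = logits.argmax(dim=-1)               # hard_token.grad_fn is None

# Gumbel-softmax relaxation: a soft, nearly one-hot sample that remains
# differentiable, so a discriminator score computed on it backpropagates.
soft_token = F.gumbel_softmax(logits, tau=0.5, hard=False)
soft_token.sum().backward()                      # stand-in for a discriminator loss
print(logits.grad is not None)                   # True: gradients flow
```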
Inspired by the success of the self-attention mechanism and the Transformer architecture in sequence transduction and image generation applications, we propose novel self-attention-based architectures to improve the performance of adversarial latent-code-based schemes in text generation.
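As a minimal sketch of the kind of building block this suggests (dimensions and hyperparameters are illustrative assumptions, not the proposed architecture), a single pre-norm self-attention layer over a sequence of latent codes:

```python
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Standard pre-norm Transformer block: self-attention plus feed-forward,
    each wrapped in a residual connection. Dimensions are illustrative."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                nn.Linear(4 * dim, dim))

    def forward(self, z):                        # z: (batch, seq_len, dim)
        h = self.norm1(z)
        attn_out, _ = self.attn(h, h, h)         # every code attends to all others
        z = z + attn_out                         # residual connection
        return z + self.ff(self.norm2(z))

codes = torch.randn(2, 16, 256)                  # batch of latent-code sequences
out = SelfAttentionBlock()(codes)                # same shape: (2, 16, 256)
```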