no code implementations • 15 Sep 2021 • Jens Hauser, Zhao Meng, Damián Pascual, Roger Wattenhofer
We combine a human evaluation of individual word substitutions and a probabilistic analysis to show that between 96% and 99% of the analyzed attacks do not preserve semantics, indicating that their success is mainly based on feeding poor data to the model.