no code implementations • 31 Jul 2023 • Xinyu Zhang, Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, Zhongjie Ba, Kui Ren
Language models, especially basic text classification models, have been shown to be susceptible to textual adversarial attacks such as synonym substitution and word insertion.
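The abstract names synonym substitution as one such attack. A minimal sketch of the idea, using an illustrative toy classifier and a hand-written synonym table (both are assumptions for illustration, not the models or word lists from the paper):

```python
# Toy synonym table; a real attack would use embeddings or WordNet.
SYNONYMS = {
    "good": ["fine", "great"],
    "bad": ["poor", "awful"],
    "movie": ["film", "picture"],
}

def toy_classifier(text):
    """Illustrative stand-in classifier: positive (1) iff 'good' appears."""
    return 1 if "good" in text.split() else 0

def synonym_substitution_attack(text, classifier):
    """Greedily swap words for synonyms until the prediction flips."""
    original = classifier(text)
    words = text.split()
    for i, w in enumerate(words):
        for syn in SYNONYMS.get(w, []):
            candidate = words[:i] + [syn] + words[i + 1:]
            if classifier(" ".join(candidate)) != original:
                return " ".join(candidate)  # adversarial example found
    return None  # no flip within the synonym budget

print(synonym_substitution_attack("good movie", toy_classifier))
```

The attack succeeds here because the toy classifier keys on a single surface word; real attacks score candidate substitutions against a neural model's output probabilities.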
no code implementations • 10 Apr 2023 • Hanbin Hong, Yuan Hong
To craft adversarial examples with a certifiable attack success rate (CASR) guarantee, we design several novel techniques: a randomized query method to query the target model, an initialization method with smoothed self-supervised perturbation to derive certifiable adversarial examples, and a geometric shifting method that reduces the perturbation size of the certifiable adversarial examples for better imperceptibility.
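The randomized query idea can be sketched as a Monte-Carlo estimate of how often a perturbed input fools the model under Gaussian noise. The linear toy model below is an assumption for illustration; the paper's actual certification derives a formal CASR bound rather than this empirical estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_model(x):
    """Illustrative binary classifier: sign of the first coordinate."""
    return int(x[0] > 0.0)

def estimated_attack_success(x_adv, true_label, sigma=0.1, n=1000):
    """Fraction of Gaussian-noised queries on which x_adv is mislabeled."""
    noise = rng.normal(0.0, sigma, size=(n, x_adv.shape[0]))
    preds = np.array([toy_model(x_adv + z) for z in noise])
    return float(np.mean(preds != true_label))

x_clean = np.array([1.0, 0.0])   # clean input, true label 1
x_adv = np.array([-0.5, 0.0])    # candidate adversarial example
rate = estimated_attack_success(x_adv, true_label=1)
print(rate)  # near 1.0: almost every noisy query flips the label
```

A geometric-shifting-style refinement would then shrink `x_adv` back toward `x_clean` while the estimated success rate stays above a target threshold.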
no code implementations • 12 Jul 2022 • Hanbin Hong, Yuan Hong
However, all of the existing methods rely on fixed i.i.d.
no code implementations • 5 Jul 2022 • Hanbin Hong, Binghui Wang, Yuan Hong
We study certified robustness of machine learning classifiers against adversarial perturbations.
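One standard construction for certified robustness is randomized smoothing: a smoothed classifier takes a majority vote over Gaussian-perturbed copies of the input, and when the top class wins with probability p > 1/2 the prediction is provably stable within an L2 radius of sigma * Phi^-1(p). This is a generic sketch of that well-known construction, with an illustrative base classifier; the paper's own certification method may differ:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

def base_classifier(x):
    """Illustrative base classifier: class 1 iff sum(x) > 0."""
    return int(np.sum(x) > 0)

def smoothed_predict(x, sigma=0.25, n=2000):
    """Majority vote under Gaussian noise; returns (class, vote share)."""
    votes = [base_classifier(x + rng.normal(0, sigma, x.shape)) for _ in range(n)]
    p = float(np.mean(votes))
    top = int(p > 0.5)
    return top, (p if top == 1 else 1 - p)

def certified_radius(p_top, sigma=0.25):
    """L2 radius within which the smoothed prediction cannot change."""
    if p_top <= 0.5:
        return 0.0
    # Clamp the vote share; in practice one uses a binomial lower
    # confidence bound on p_top instead of the raw empirical fraction.
    return sigma * NormalDist().inv_cdf(min(p_top, 0.999))

x = np.array([1.0, 1.0])
cls, p = smoothed_predict(x)
print(cls, round(certified_radius(p), 3))
```

The guarantee is probabilistic in the estimate of p but deterministic once p is bounded: no perturbation inside the certified radius can change the smoothed prediction.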
no code implementations • 2 Feb 2022 • Hanbin Hong, Yuan Hong, Yu Kong
In this paper, we show that gradients can also be exploited as a powerful weapon to defend against adversarial attacks.
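One plausible reading of gradients-as-defense is input purification: before classifying, take a few gradient-ascent steps on the input to push it deeper into the model's current decision region, undoing small adversarial shifts. The linear two-class model below is an illustrative assumption, not the paper's actual defense:

```python
import numpy as np

W = np.array([[1.0, -1.0], [-1.0, 1.0]])  # toy 2-class linear logits: W @ x

def logits(x):
    return W @ x

def purify(x, steps=10, lr=0.1):
    """Gradient ascent on the current top-class logit w.r.t. the input.

    For a linear model, d(logit_c)/dx is simply row W[c].
    """
    x = x.copy()
    for _ in range(steps):
        c = int(np.argmax(logits(x)))
        x += lr * W[c]
    return x

x_adv = np.array([0.1, 0.05])   # weakly perturbed input near the boundary
x_pure = purify(x_adv)
print(int(np.argmax(logits(x_pure))))
```

The purified input keeps the same predicted class but with a much larger logit margin, so a small adversarial perturbation can no longer flip it.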