Center Smoothing: Certified Robustness for Networks with Structured Outputs

NeurIPS 2021  ·  Aounon Kumar, Tom Goldstein ·

The study of provable adversarial robustness has mostly been limited to classification tasks and models with one-dimensional real-valued outputs. We extend the scope of certifiable robustness to problems with more general and structured outputs like sets, images, language, etc. We model the output space as a metric space under a distance/similarity function, such as intersection-over-union, perceptual similarity, total variation distance, etc. Such models are used in many machine learning problems like image segmentation, object detection, generative models, image/audio-to-text systems, etc. Based on a robustness technique called randomized smoothing, our $\textit{center smoothing}$ procedure can produce models with the guarantee that the change in the output, as measured by the distance metric, remains small for any norm-bounded adversarial perturbation of the input. We apply our method to create certifiably robust models with disparate output spaces - from sets to images - and show that it yields meaningful certificates without significantly degrading the performance of the base model. Code for our experiments is available at: https://github.com/aounon/center-smoothing.

PDF Abstract NeurIPS 2021 PDF NeurIPS 2021 Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here