Provably Robust Transfer

Knowledge transfer is an effective tool for learning, especially when labeled data is scarce or training from scratch is prohibitively costly. The vast majority of the transfer learning literature focuses on obtaining accurate models and neglects adversarial robustness. Yet robustness is essential, particularly when transferring to safety-critical domains. We analyze and improve the robustness of a popular transfer learning framework consisting of two parts: a feature extractor and a classifier that is re-trained on the target domain. Our experiments show how adversarial training on the source domain affects robustness on both the source and target domains, and we propose the first provably robust transfer learning models. We obtain strong robustness guarantees by bounding the worst-case change in the extracted features while controlling the Lipschitz constant of the classifier. Our models maintain high accuracy while significantly improving provable robustness.
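
To make the certificate concrete, below is a minimal sketch (not the authors' implementation) of how such a guarantee can be checked for a linear classifier on frozen features. It assumes a known bound `lip_f` on the feature extractor's Lipschitz constant, so that an l2 input perturbation of radius `eps` can change the features by at most `lip_f * eps`; the function name and variables are illustrative.

```python
import numpy as np

def certify(features, W, b, lip_f, eps):
    """Check whether a prediction is provably robust.

    features : (d,) feature vector phi(x) from the frozen extractor
    W, b     : linear classifier weights (num_classes, d) and biases
    lip_f    : assumed Lipschitz bound of the feature extractor
    eps      : l2 radius of the input perturbation

    Worst-case feature change: ||phi(x + delta) - phi(x)||_2 <= lip_f * eps.
    Since logits are linear in the features, class y beats class j for
    every perturbed input iff (z_y - z_j) > ||W_y - W_j||_2 * lip_f * eps.
    """
    z = W @ features + b
    y = int(np.argmax(z))
    delta = lip_f * eps  # bound on the worst-case feature change
    for j in range(len(z)):
        if j == y:
            continue
        if z[y] - z[j] <= np.linalg.norm(W[y] - W[j]) * delta:
            return y, False  # certificate fails for class pair (y, j)
    return y, True  # prediction cannot be flipped within the eps-ball
```

The same margin check extends to a nonlinear classifier head by replacing the per-pair norm ||W_y - W_j||_2 with a Lipschitz bound on the corresponding logit difference, which is why controlling the classifier's Lipschitz constant matters for the guarantee.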
