Have You Stolen My Model? Evasion Attacks Against Deep Neural Network Watermarking Techniques

3 Sep 2018 · Dorjan Hitaj, Luigi V. Mancini

Deep neural networks have had enormous impact on various domains of computer science, considerably outperforming previous state-of-the-art machine learning techniques. To achieve this performance, neural networks need large quantities of data and huge computational resources, which heavily increases their construction costs. The increased cost of building a good deep neural network model gives rise to a need for protecting this investment from potential copyright infringements. Legitimate owners of a machine learning model want to be able to reliably track and detect a malicious adversary that tries to steal the intellectual property related to the model. Recently, this problem was tackled by introducing the concept of watermarking in deep neural networks, which allows a legitimate owner to embed some secret information (a watermark) in a given model. The watermark allows the legitimate owner to detect copyright infringements of the model. This paper focuses on verifying the robustness and reliability of state-of-the-art deep neural network watermarking schemes. We show that a malicious adversary, even in scenarios where the watermark is difficult to remove, can still evade verification by the legitimate owner, thus avoiding detection of model theft.
