no code implementations • 7 Jun 2024 • Weiran Lin, Anna Gerchanovsky, Omer Akgul, Lujo Bauer, Matt Fredrikson, Zifan Wang
Writing effective prompts for large language models (LLMs) can be unintuitive and burdensome.
1 code implementation • 29 Jun 2023 • Weiran Lin, Keane Lucas, Neo Eyal, Lujo Bauer, Michael K. Reiter, Mahmood Sharif
In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing attacks.
no code implementations • 27 Feb 2023 • Keane Lucas, Matthew Jagielski, Florian Tramèr, Lujo Bauer, Nicholas Carlini
It is becoming increasingly imperative to design robust ML defenses.
1 code implementation • NeurIPS 2023 • Zhuoqun Huang, Neil G. Marchant, Keane Lucas, Lujo Bauer, Olga Ohrimenko, Benjamin I. P. Rubinstein
When applied to the popular MalConv malware detection model, our smoothing mechanism RS-Del achieves a certified accuracy of 91% at an edit distance radius of 128 bytes.
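The entry above describes a smoothing mechanism (RS-Del) that certifies a byte-level malware detector such as MalConv within an edit-distance radius. As a rough illustration of the general flavor of randomized deletion smoothing, and not the paper's exact mechanism or certification procedure, the sketch below randomly deletes bytes from an input, classifies each perturbed copy with a hypothetical base classifier `base_classify`, and takes a majority vote:

```python
import random
from collections import Counter

def delete_bytes(x: bytes, p_del: float, rng: random.Random) -> bytes:
    """Return a copy of x with each byte independently deleted with probability p_del."""
    return bytes(b for b in x if rng.random() >= p_del)

def smoothed_classify(x: bytes, base_classify, n_samples: int = 100,
                      p_del: float = 0.03, seed: int = 0) -> int:
    """Majority vote of the base classifier over deletion-perturbed copies of x.

    `base_classify` is a hypothetical function mapping a byte string to a class
    label (e.g., 0 = benign, 1 = malware); it stands in for a model like MalConv.
    This is only an illustration of smoothing by deletion, not RS-Del itself.
    """
    rng = random.Random(seed)
    votes = Counter(base_classify(delete_bytes(x, p_del, rng)) for _ in range(n_samples))
    return votes.most_common(1)[0][0]
```

Deriving a certified radius would additionally require a statistical bound on the vote margin; that step is omitted from this sketch.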
1 code implementation • 28 Dec 2021 • Weiran Lin, Keane Lucas, Lujo Bauer, Michael K. Reiter, Mahmood Sharif
First, we demonstrate a loss function that explicitly encodes (1), and we show that Auto-PGD finds more attacks when using it.
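The entry above refers to a loss function encoding goal (1); the numbering belongs to the full abstract and is not reproduced here. The sketch below is not the authors' loss. It only shows, under generic assumptions, how a margin-style misclassification loss can be plugged into a plain L-infinity PGD loop in PyTorch (a simplification of Auto-PGD):

```python
import torch

def margin_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Margin between the largest non-true-class logit and the true-class logit.

    Maximizing this pushes the model toward misclassification; it is a generic
    example of encoding the attack goal directly in the loss, not the specific
    loss from the paper above.
    """
    true_logit = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    other = logits.clone()
    other.scatter_(1, labels.unsqueeze(1), float('-inf'))
    best_other = other.max(dim=1).values
    return (best_other - true_logit).mean()

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Plain L-infinity PGD ascent on the margin loss above."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = margin_loss(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)                      # keep valid pixel range
    return x_adv.detach()
```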
1 code implementation • 19 Dec 2019 • Keane Lucas, Mahmood Sharif, Lujo Bauer, Michael K. Reiter, Saurabh Shintre
Moreover, we found that our attack can fool some commercial anti-virus programs, in certain cases with a success rate of 85%.
no code implementations • 19 Dec 2019 • Mahmood Sharif, Lujo Bauer, Michael K. Reiter
This paper proposes a new defense called $n$-ML against adversarial examples, i.e., inputs crafted by perturbing benign inputs by small amounts to induce misclassifications by classifiers.
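The snippet above does not describe how $n$-ML works, so the sketch below is only a heavily simplified illustration of one general flavor of ensemble defense: accept a label only when enough models agree, and otherwise flag the input as suspicious. It is not a faithful reproduction of $n$-ML's training or decision rule; the function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

def ensemble_decide(x, models: Sequence[Callable], min_agreement: int) -> Optional[int]:
    """Classify x only if at least `min_agreement` models output the same label.

    Returns the agreed label, or None to signal rejection (a possibly
    adversarial input). Each element of `models` is assumed to map an input
    to an integer class label.
    """
    votes = Counter(m(x) for m in models)
    label, count = votes.most_common(1)[0]
    return label if count >= min_agreement else None

# Example usage (hypothetical classifiers):
# decision = ensemble_decide(img, classifiers, min_agreement=len(classifiers))
```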
no code implementations • 27 Feb 2018 • Mahmood Sharif, Lujo Bauer, Michael K. Reiter
Combined with prior work, we thus demonstrate that nearness of inputs as measured by $L_p$-norms is neither necessary nor sufficient for perceptual similarity, which has implications for both creating and defending against adversarial examples.
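The claim above concerns nearness measured by $L_p$ norms. For reference, a minimal sketch of what that measurement means for two equally shaped images stored as NumPy arrays: a perturbation can be tiny under these norms while saying nothing, by itself, about perceptual similarity.

```python
import numpy as np

def lp_distances(x: np.ndarray, y: np.ndarray) -> dict:
    """L2 and L-infinity distances between two images with pixel values in [0, 1]."""
    diff = (x - y).ravel()
    return {
        "L2": float(np.linalg.norm(diff, ord=2)),   # Euclidean distance
        "Linf": float(np.max(np.abs(diff))),        # largest single-pixel change
    }

# Example: a small uniform shift is close in both norms.
x = np.random.default_rng(0).random((32, 32, 3))
y = np.clip(x + 0.01, 0.0, 1.0)
print(lp_distances(x, y))
```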
3 code implementations • 31 Dec 2017 • Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, Michael K. Reiter
Images perturbed subtly to be misclassified by neural networks, called adversarial examples, have emerged as a technically deep challenge and an important concern for several application domains.