1 code implementation • 5 Mar 2024 • Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Samuel Marks, Oam Patel, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Lin, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Ruoyu Wang, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks
To measure the risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs.
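As a concrete illustration of what such an evaluation can look like, the sketch below scores a causal language model on a multiple-choice item by comparing per-choice losses. The item format, the log-likelihood scoring rule, and the stand-in gpt2 checkpoint are all assumptions made for illustration; this is not the official harness of any particular benchmark.

```python
# Minimal sketch of multiple-choice capability scoring (assumed item format
# and scoring rule; not any benchmark's official evaluation harness).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")                  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def score_item(question: str, choices: list[str]) -> int:
    """Return the index of the choice to which the model assigns the lowest loss."""
    losses = []
    for choice in choices:
        ids = tok(f"{question}\nAnswer: {choice}", return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(ids, labels=ids)     # mean token cross-entropy
        losses.append(out.loss.item())
    return min(range(len(choices)), key=losses.__getitem__)

print(score_item("What is the capital of France?",
                 ["Berlin", "Paris", "Madrid", "Rome"]))
```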
no code implementations • 18 Oct 2023 • Mingzhang Yin, Ruijiang Gao, Weiran Lin, Steven M. Shugan
Cross-pollinating machine learning and experimental design, GBS scales to products with hundreds of attributes and can design personalized products for heterogeneous consumers.
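Reading "GBS" as a gradient-based method over a differentiable model of consumer utility, one illustrative interpretation is sketched below. The utility network, the attribute bounds, and the update loop are all assumptions made for the sake of a sketch; this is not the paper's GBS procedure.

```python
# Illustrative sketch: gradient-based design of a product's attribute vector
# against a learned, differentiable consumer-utility model. The utility
# network and update loop are assumptions for illustration only; they are
# not the GBS procedure from the paper.
import torch

n_attrs = 200  # e.g., a product with hundreds of attributes

utility = torch.nn.Sequential(   # stand-in for a fitted preference model
    torch.nn.Linear(n_attrs, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)

x = torch.zeros(n_attrs, requires_grad=True)   # candidate product design
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(500):
    opt.zero_grad()
    loss = -utility(x)            # ascend the consumer's predicted utility
    loss.backward()
    opt.step()
    with torch.no_grad():
        x.clamp_(0.0, 1.0)        # keep attribute levels in a feasible range
```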
1 code implementation • 29 Jun 2023 • Weiran Lin, Keane Lucas, Neo Eyal, Lujo Bauer, Michael K. Reiter, Mahmood Sharif
In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing attacks.
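One example of such a scenario is an attacker who succeeds if an input is classified as any class in a target set (e.g., any identity on an allow-list), rather than as one fixed class. The loss below is one way to encode that many-to-one goal; the exact formulation is an assumption for illustration, not necessarily the loss proposed in the paper.

```python
# Illustrative sketch: an attack loss for a "group" threat model, where the
# attacker succeeds if the input lands in *any* class from a target set.
# This particular formulation is an assumption, not necessarily the paper's.
import torch

def group_target_loss(logits: torch.Tensor, target_group: list[int]) -> torch.Tensor:
    """Negative exactly when some target-group logit beats every other logit."""
    mask = torch.zeros_like(logits, dtype=torch.bool)
    mask[target_group] = True
    best_target = logits[mask].max()
    best_other = logits[~mask].max()
    return best_other - best_target   # minimize to push the input into the group
```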
1 code implementation • 28 Dec 2021 • Weiran Lin, Keane Lucas, Lujo Bauer, Michael K. Reiter, Mahmood Sharif
First, we demonstrate a loss function that explicitly encodes goal (1), assigning the target class a higher probability than any other class, and show that Auto-PGD finds more attacks with it.
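A common way to encode that kind of targeted goal is a logit-margin loss that becomes negative exactly when the target class's logit exceeds every other logit. The sketch below is one such formulation, in the style of Carlini-Wagner margins; whether it matches the paper's exact loss is an assumption.

```python
# Illustrative sketch: a per-example margin that is negative exactly when
# the target class's logit exceeds every other logit. Plugging a loss like
# this into an attack such as Auto-PGD is the general idea; this exact
# formulation is an assumption, not the paper's code.
import torch

def target_margin_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """logits: (batch, classes); target: (batch,) target class indices."""
    target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
    other = logits.clone()
    other.scatter_(1, target.unsqueeze(1), float("-inf"))  # mask out the target
    best_other = other.max(dim=1).values
    # Minimizing drives each target logit above all competing logits.
    return (best_other - target_logit).mean()
```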