Search Results for author: Russell Kaplan

Found 4 papers, 2 papers with code

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

no code implementations • 5 Mar 2024 • Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Sam Marks, Oam Patel, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Liu, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks

To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs.

Multiple-choice


Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs

no code implementations • 28 Apr 2023 • George Pu, Anirudh Jain, Jihan Yin, Russell Kaplan

As foundation models continue to exponentially scale in size, efficient methods of adaptation become increasingly critical.


HiDDeN: Hiding Data With Deep Networks

6 code implementations • ECCV 2018 • Jiren Zhu, Russell Kaplan, Justin Johnson, Li Fei-Fei

We show that these encodings are competitive with existing data hiding algorithms, and further that they can be made robust to noise: our models learn to reconstruct hidden information in an encoded image despite the presence of Gaussian blurring, pixel-wise dropout, cropping, and JPEG compression.


Beating Atari with Natural Language Guided Reinforcement Learning

1 code implementation • 18 Apr 2017 • Russell Kaplan, Christopher Sauer, Alexander Sosa

We introduce the first deep reinforcement learning agent that learns to beat Atari games with the aid of natural language instructions.

Montezuma's Revenge OpenAI Gym +2

