Search Results for author: Lennart Justen

Found 3 papers, 0 papers with code

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

no code implementations • 5 Mar 2024 • Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Samuel Marks, Oam Patel, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Liu, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Ruoyu Wang, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks

To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs.

Multiple-choice


Will releasing the weights of future large language models grant widespread access to pandemic agents?

no code implementations • 25 Oct 2023 • Anjali Gopal, Nathan Helm-Burger, Lennart Justen, Emily H. Soice, Tiffany Tzeng, Geetha Jeyapragasan, Simon Grimm, Benjamin Mueller, Kevin M. Esvelt

Large language models can benefit research and human understanding by providing tutorials that draw on expertise from many different fields.


No Time Like the Present: Effects of Language Change on Automated Comment Moderation

no code implementations • 8 Jul 2022 • Lennart Justen, Kilian Müller, Marco Niemann, Jörg Becker

As a result, there is growing interest in using machine learning and natural language processing for (semi-) automated abusive language detection to avoid manual comment moderation costs or having to shut down comment sections altogether.

Abusive Language

