Search Results for author: Joe Kwon

Found 3 papers, 3 papers with code

Explore, Establish, Exploit: Red Teaming Language Models from Scratch

3 code implementations • 15 Jun 2023 • Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell

Using a pre-existing classifier does not allow for red-teaming to be tailored to the target model.


Forecasting Future World Events with Neural Networks

1 code implementation • 30 Jun 2022 • Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks

We test language models on our forecasting task and find that performance is far below a human expert baseline.

Tasks: Decision Making, Language Modelling


Scaling Out-of-Distribution Detection for Real-World Settings

3 code implementations • 25 Nov 2019 • Dan Hendrycks, Steven Basart, Mantas Mazeika, Andy Zou, Joe Kwon, Mohammadreza Mostajabi, Jacob Steinhardt, Dawn Song

We conduct extensive experiments in these more realistic settings for out-of-distribution detection and find that a surprisingly simple detector based on the maximum logit outperforms prior methods in all the large-scale multi-class, multi-label, and segmentation tasks, establishing a simple new baseline for future work.

Tasks: Out-of-Distribution Detection, Segmentation (+2 more)

