Search Results for author: Seraphina Nix

Found 4 papers, 3 papers with code

HCAST: Human-Calibrated Autonomy Software Tasks

1 code implementation21 Mar 2025 David Rein, Joel Becker, Amy Deng, Seraphina Nix, Chris Canal, Daniel O'Connel, Pip Arnott, Ryan Bloom, Thomas Broadley, Katharyn Garcia, Brian Goodrich, Max Hasin, Sami Jawhar, Megan Kinniment, Thomas Kwa, Aron Lajko, Nate Rush, Lucas Jun Koba Sato, Sydney von Arx, Ben West, Lawrence Chan, Elizabeth Barnes

To understand and predict the societal impacts of highly autonomous AI systems, we need benchmarks with grounding, i. e., metrics that directly connect AI performance to real-world effects we care about.

Adversarial Training for High-Stakes Reliability

no code implementations3 May 2022 Daniel M. Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis, Noa Nabeshima, Ben Weinstein-Raun, Daniel de Haas, Buck Shlegeris, Nate Thomas

We found that adversarial training increased robustness to the adversarial attacks that we trained on -- doubling the time for our contractors to find adversarial examples both with our tool (from 13 to 26 minutes) and without (from 20 to 44 minutes) -- without affecting in-distribution performance.

Text Generation Vocal Bursts Intensity Prediction

Cannot find the paper you are looking for? You can Submit a new open access paper.