Search Results for author: Adam Scherlis

Found 3 papers, 0 papers with code

Polysemanticity and Capacity in Neural Networks

no code implementations · 4 Oct 2022 · Adam Scherlis, Kshitij Sachan, Adam S. Jermyn, Joe Benton, Buck Shlegeris

We show that, in a toy model, the optimal capacity allocation tends to monosemantically represent the most important features, polysemantically represent less important features (in proportion to their impact on the loss), and entirely ignore the least important features.

Adversarial Training for High-Stakes Reliability

no code implementations · 3 May 2022 · Daniel M. Ziegler, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis, Noa Nabeshima, Ben Weinstein-Raun, Daniel de Haas, Buck Shlegeris, Nate Thomas

We found that adversarial training increased robustness to the adversarial attacks that we trained on -- doubling the time for our contractors to find adversarial examples both with our tool (from 13 to 26 minutes) and without (from 20 to 44 minutes) -- without affecting in-distribution performance.

Tasks: Text Generation · Vocal Bursts Intensity Prediction

The Goldilocks zone: Towards better understanding of neural network loss landscapes

no code implementations · 6 Jul 2018 · Stanislav Fort, Adam Scherlis

We observe this effect for fully-connected neural networks over a range of network widths and depths on MNIST and CIFAR-10 datasets with the $\mathrm{ReLU}$ and $\tanh$ non-linearities, and a similar effect for convolutional networks.
