3 code implementations • 13 Oct 2021 • Shlomi Hod, Daniel Filan, Stephen Casper, Andrew Critch, Stuart Russell
These results suggest that graph-based partitioning can reveal local specialization and that statistical methods can be used to automatically screen for sets of neurons that can be understood abstractly.
no code implementations • 29 Sep 2021 • Shlomi Hod, Stephen Casper, Daniel Filan, Cody Wild, Andrew Critch, Stuart Russell
These results suggest that graph-based partitioning can reveal modularity and help us understand how deep neural networks function.
2 code implementations • 4 Mar 2021 • Daniel Filan, Stephen Casper, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell
We also exhibit novel methods to promote clusterability in neural network training, and find that in multi-layer perceptrons they lead to more clusterable networks with little reduction in accuracy.
no code implementations • 1 Jan 2021 • Shlomi Hod, Stephen Casper, Daniel Filan, Cody Wild, Andrew Critch, Stuart Russell
We apply these methods on partitionings generated by a spectral clustering algorithm which uses a graph representation of the network's neurons and weights.
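A minimal sketch of the kind of pipeline this entry describes: build an undirected graph whose nodes are an MLP's neurons and whose edge weights are absolute connection weights, then split it spectrally. This simplified version does a two-way split by the sign of the Fiedler vector; the actual papers cluster into more groups and use further normalization, so treat this purely as an illustration.

```python
import numpy as np

def weight_graph_laplacian(weight_mats):
    """Graph over all neurons of an MLP: the edge weight between
    neuron i in layer l and neuron j in layer l+1 is |W[l][j, i]|.
    Returns the symmetric normalized Laplacian (assumes no neuron
    is fully disconnected)."""
    sizes = [W.shape[1] for W in weight_mats] + [weight_mats[-1].shape[0]]
    offs = np.cumsum([0] + sizes)
    n = offs[-1]
    A = np.zeros((n, n))
    for l, W in enumerate(weight_mats):
        A[offs[l + 1]:offs[l + 1] + W.shape[0],
          offs[l]:offs[l] + W.shape[1]] = np.abs(W)
    A = A + A.T                         # make the graph undirected
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def spectral_bipartition(L):
    """Two-way partition by the sign of the Fiedler vector, i.e. the
    eigenvector of the second-smallest Laplacian eigenvalue."""
    _, vecs = np.linalg.eigh(L)         # eigh sorts eigenvalues ascending
    return (vecs[:, 1] > 0).astype(int)
```

On a toy one-layer network whose weights form two nearly disconnected blocks, the sign split recovers the two blocks as clusters.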
2 code implementations • 8 Nov 2020 • Gavin Brown, Shlomi Hod, Iden Kalemaj
We propose a theoretical framework where the response of a target population to the deployed classifier is modeled as a function of the classifier and the current state (distribution) of the population.
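One way to picture a stateful framework like this is a retraining loop: each round, the learner deploys the classifier that is optimal for the current population state, and the population then shifts as a function of both that classifier and its own previous state. The specific dynamics and best response below are illustrative assumptions, not the paper's model.

```python
def population_response(theta, mu, rate=0.5, sensitivity=1.0):
    """Hypothetical stateful dynamics: the population mean drifts part
    of the way toward a classifier-dependent target each round, so the
    next state depends on both the deployed classifier and the current
    state (all parameters here are assumptions for illustration)."""
    return (1 - rate) * mu + rate * sensitivity * theta

def repeated_risk_minimization(mu0, rounds=30):
    """Repeatedly deploy the classifier that is optimal for the current
    population state; theta = mu / 2 is an assumed toy best response."""
    mu = mu0
    for _ in range(rounds):
        theta = mu / 2                       # retrain on current state
        mu = population_response(theta, mu)  # population reacts
    return theta, mu
```

With these toy choices the map contracts (`mu` shrinks by a factor of 0.75 per round), so repeated retraining converges to a stable point where classifier and population no longer move.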
1 code implementation • 10 Mar 2020 • Daniel Filan, Shlomi Hod, Cody Wild, Andrew Critch, Stuart Russell
To discern structure in these weights, we introduce a measurable notion of modularity for multi-layer perceptrons (MLPs), and investigate the modular structure of MLPs trained on datasets of small images.
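A common way to make such a notion of modularity measurable is a normalized-cut-style statistic over the weight graph: for each cluster, the weight leaving it divided by its total incident weight, summed over clusters. The sketch below computes that statistic for a given adjacency matrix and partition; take the exact normalization used in the paper as potentially different.

```python
import numpy as np

def n_cut(A, labels):
    """Normalized cut of a partition of a weighted graph.
    A: symmetric adjacency matrix (nonnegative, zero diagonal).
    labels: integer cluster label per node.
    Lower values mean the clusters are better separated."""
    total = 0.0
    for c in np.unique(labels):
        mask = labels == c
        cut = A[mask][:, ~mask].sum()   # weight leaving the cluster
        vol = A[mask].sum()             # total weight incident to it
        total += cut / vol
    return total
```

Two disconnected cliques given their natural partition score exactly 0, while a partition that mixes the cliques scores strictly higher, matching the intuition that a modular network admits a low-cut split.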