1 code implementation • 12 Apr 2023 • Nicholas Goldowsky-Dill, Chris MacLeod, Lucas Sato, Aryaman Arora
Localizing behaviors of neural networks to a subset of the network's components or a subset of interactions between components is a natural first step towards analyzing network mechanisms and possible failure modes.