Search Results for author: Lucas Sato

Found 2 papers, 2 papers with code

Localizing Model Behavior with Path Patching

1 code implementation • 12 Apr 2023 • Nicholas Goldowsky-Dill, Chris MacLeod, Lucas Sato, Aryaman Arora

Localizing behaviors of neural networks to a subset of the network's components or a subset of interactions between components is a natural first step towards analyzing network mechanisms and possible failure modes.
