no code implementations • 27 Jul 2023 • Clément Guerner, Anej Svete, Tianyu Liu, Alexander Warstadt, Ryan Cotterell
The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
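Under the linear subspace hypothesis, erasing a concept from representations amounts to projecting them onto the orthogonal complement of the concept subspace. A minimal sketch, assuming a toy representation matrix `H` and a hypothetical concept-subspace basis `B` (neither is from the paper):

```python
import numpy as np

def erase_subspace(H, B):
    """Project representations H (n x d) off the subspace
    spanned by the columns of B (d x k)."""
    # Orthonormalize the subspace basis.
    Q, _ = np.linalg.qr(B)
    # Orthogonal projector onto the complement: I - Q Q^T.
    P = np.eye(H.shape[1]) - Q @ Q.T
    return H @ P

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4))   # toy representations
B = rng.standard_normal((4, 1))   # toy concept direction
H_erased = erase_subspace(H, B)
# After erasure, the representations have no component along B,
# so H_erased @ B is (numerically) zero.
```

If the hypothesis holds, a probe for the concept should perform at chance on `H_erased`; that is the sense in which the subspace captures "all" the concept information.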
1 code implementation • 6 Jul 2023 • Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell
Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs.
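The gradient of a model's output with respect to its inputs measures per-feature sensitivity; multiplying it by the input itself (gradient-times-input) is one common attribution heuristic built on it. A minimal sketch on a hypothetical toy model `score(x) = w . tanh(x)` (the model and values are illustrative, not from the paper):

```python
import numpy as np

def score(x, w):
    # Toy scalar-output "model": a weighted sum of tanh features.
    return w @ np.tanh(x)

def grad_score(x, w):
    # Analytic gradient: d/dx_i [w_i * tanh(x_i)] = w_i * (1 - tanh(x_i)^2).
    return w * (1 - np.tanh(x) ** 2)

x = np.array([0.5, -1.0, 2.0])   # toy input
w = np.array([1.0, 2.0, -0.5])   # toy weights
# Gradient-times-input attribution: one score per input feature.
attribution = grad_score(x, w) * x
```

Feature-attribution methods of this family differ mainly in how they aggregate such gradients (e.g. at a single point versus along a path from a baseline input).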