no code implementations • 27 Jul 2023 • Clément Guerner, Anej Svete, Tianyu Liu, Alexander Warstadt, Ryan Cotterell
The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
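Under the linear subspace hypothesis, erasing a concept from representations amounts to projecting them onto the orthogonal complement of the concept subspace. A minimal sketch, assuming a toy representation matrix `H` and a hypothetical concept-subspace basis `B` (neither is from the paper):

```python
import numpy as np

def erase_subspace(H, B):
    """Project representations H (n x d) off the subspace
    spanned by the columns of B (d x k)."""
    # Orthonormalize the subspace basis.
    Q, _ = np.linalg.qr(B)
    # Orthogonal projector onto the complement: I - Q Q^T.
    P = np.eye(H.shape[1]) - Q @ Q.T
    return H @ P

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 4))   # toy representations
B = rng.standard_normal((4, 1))   # toy concept direction
H_erased = erase_subspace(H, B)
# After erasure, the representations have no component along B,
# so H_erased @ B is (numerically) zero.
```

If the hypothesis holds, a probe for the concept should perform at chance on `H_erased`; that is the sense in which the subspace captures "all" the concept information.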
1 code implementation • 6 Jul 2023 • Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell
Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs.
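The gradient of a model's output with respect to its inputs measures per-feature sensitivity; multiplying it by the input itself (gradient-times-input) is one common attribution heuristic built on it. A minimal sketch on a hypothetical toy model `score(x) = w . tanh(x)` (the model and values are illustrative, not from the paper):

```python
import numpy as np

def score(x, w):
    # Toy scalar-output "model": a weighted sum of tanh features.
    return w @ np.tanh(x)

def grad_score(x, w):
    # Analytic gradient: d/dx_i [w_i * tanh(x_i)] = w_i * (1 - tanh(x_i)^2).
    return w * (1 - np.tanh(x) ** 2)

x = np.array([0.5, -1.0, 2.0])   # toy input
w = np.array([1.0, 2.0, -0.5])   # toy weights
# Gradient-times-input attribution: one score per input feature.
attribution = grad_score(x, w) * x
```

Feature-attribution methods of this family differ mainly in how they aggregate such gradients (e.g. at a single point versus along a path from a baseline input).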