Search Results for author: Alexander Warstadt

Found 2 papers, 1 paper with code

A Geometric Notion of Causal Probing

no code implementations • 27 Jul 2023 • Clément Guerner, Anej Svete, Tianyu Liu, Alexander Warstadt, Ryan Cotterell

The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.

counterfactual, Language Modelling

Generalizing Backpropagation for Gradient-Based Interpretability

1 code implementation • 6 Jul 2023 • Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, Ryan Cotterell

Many popular feature-attribution methods for interpreting deep neural networks rely on computing the gradients of a model's output with respect to its inputs.
