Search Results for author: Shahar Katz

Found 2 papers, 1 paper with code

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

no code implementations · 20 Feb 2024 · Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community.

Language Modelling
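
The "backward lens" of the paper title refers to reading gradients through the model's unembedding matrix, the same projection used to map hidden states to tokens. Below is a minimal sketch of that general idea, not the authors' implementation; the choice of GPT-2, the layer index, and the prompt are assumptions made purely for illustration.

```python
# Minimal sketch (assumption: GPT-2 via Hugging Face transformers, not the paper's code):
# project the gradient of an MLP projection matrix onto the vocabulary via the
# unembedding matrix, in the spirit of "Backward Lens".
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()  # populates .grad on the model weights

layer = 5  # arbitrary layer chosen for illustration
W_out = model.transformer.h[layer].mlp.c_proj.weight  # Conv1D weight: (intermediate_dim, hidden_dim)
grad = W_out.grad                                      # same shape as W_out

# Project each gradient row into vocabulary space with the unembedding matrix.
unembed = model.lm_head.weight        # (vocab_size, hidden_dim)
vocab_proj = grad @ unembed.T         # (intermediate_dim, vocab_size)

# Inspect which vocabulary items the first gradient row points toward.
top = torch.topk(vocab_proj[0], k=5).indices
print(tokenizer.convert_ids_to_tokens(top.tolist()))
```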

VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers

2 code implementations · 22 May 2023 · Shahar Katz, Yonatan Belinkov

Recent advances in interpretability suggest we can project weights and hidden states of transformer-based language models (LMs) to their vocabulary, a transformation that makes them more human-interpretable.
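
As a rough illustration of what projecting hidden states to the vocabulary looks like (a logit-lens style reading, sketched here under the assumption of GPT-2 through Hugging Face transformers; this is not VISIT's own code):

```python
# Minimal sketch: decode the top vocabulary item suggested by each layer's
# hidden state at the last position, using the unembedding matrix.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

unembed = model.lm_head.weight     # (vocab_size, hidden_dim)
final_ln = model.transformer.ln_f  # final layer norm, applied before unembedding

for layer_idx, hidden in enumerate(out.hidden_states):
    vec = final_ln(hidden[0, -1])              # hidden state of last position, (hidden_dim,)
    logits = vec @ unembed.T                   # projection to vocabulary space
    top_token = tokenizer.decode(logits.argmax().item())
    print(f"layer {layer_idx}: {top_token!r}")
```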
