Search Results for author: Shahar Katz

Found 2 papers, 1 paper with code

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

no code implementations · 20 Feb 2024 · Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community.

Language Modelling
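
The "backward lens" of the paper title refers to reading gradients through the model's unembedding matrix, the same projection used to map hidden states to tokens. Below is a minimal sketch of that general idea, not the authors' implementation; the choice of GPT-2, the layer index, and the prompt are assumptions made purely for illustration.

```python
# Minimal sketch (assumption: GPT-2 via Hugging Face transformers, not the paper's code):
# project the gradient of an MLP projection matrix onto the vocabulary via the
# unembedding matrix, in the spirit of "Backward Lens".
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The capital of France is", return_tensors="pt")
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()  # populates .grad on the model weights

layer = 5  # arbitrary layer chosen for illustration
W_out = model.transformer.h[layer].mlp.c_proj.weight  # Conv1D weight: (intermediate_dim, hidden_dim)
grad = W_out.grad                                      # same shape as W_out

# Project each gradient row into vocabulary space with the unembedding matrix.
unembed = model.lm_head.weight        # (vocab_size, hidden_dim)
vocab_proj = grad @ unembed.T         # (intermediate_dim, vocab_size)

# Inspect which vocabulary items the first gradient row points toward.
top = torch.topk(vocab_proj[0], k=5).indices
print(tokenizer.convert_ids_to_tokens(top.tolist()))
```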

VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers

2 code implementations · 22 May 2023 · Shahar Katz, Yonatan Belinkov

Recent advances in interpretability suggest we can project weights and hidden states of transformer-based language models (LMs) to their vocabulary, a transformation that makes them more human-interpretable.
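
As a rough illustration of what projecting hidden states to the vocabulary looks like (a logit-lens style reading, sketched here under the assumption of GPT-2 through Hugging Face transformers; this is not VISIT's own code):

```python
# Minimal sketch: decode the top vocabulary item suggested by each layer's
# hidden state at the last position, using the unembedding matrix.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

unembed = model.lm_head.weight     # (vocab_size, hidden_dim)
final_ln = model.transformer.ln_f  # final layer norm, applied before unembedding

for layer_idx, hidden in enumerate(out.hidden_states):
    vec = final_ln(hidden[0, -1])              # hidden state of last position, (hidden_dim,)
    logits = vec @ unembed.T                   # projection to vocabulary space
    top_token = tokenizer.decode(logits.argmax().item())
    print(f"layer {layer_idx}: {top_token!r}")
```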
