Search Results for author: Bilal Chughtai

Found 2 papers, 1 paper with code

Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs

no code implementations • 11 Feb 2024 • Bilal Chughtai, Alan Cooney, Neel Nanda

How do transformer-based large language models (LLMs) store and retrieve knowledge?

A Toy Model of Universality: Reverse Engineering How Networks Learn Group Operations

1 code implementation • 6 Feb 2023 • Bilal Chughtai, Lawrence Chan, Neel Nanda

Universality is a key hypothesis in mechanistic interpretability -- that different models learn similar features and circuits when trained on similar tasks.
