Search Results for author: Lucia Quirke

Found 4 papers, 4 papers with code

Neural Networks Learn Statistics of Increasing Complexity

1 code implementation • 6 Feb 2024 • Nora Belrose, Quintin Pope, Lucia Quirke, Alex Mallen, Xiaoli Fern

The distributional simplicity bias (DSB) posits that neural networks learn low-order moments of the data distribution first, before moving on to higher-order correlations.

Structured World Representations in Maze-Solving Transformers

1 code implementation • 5 Dec 2023 • Michael Igorevich Ivanitskiy, Alex F. Spies, Tilman Räuker, Guillaume Corlouer, Chris Mathwin, Lucia Quirke, Can Rager, Rusheb Shah, Dan Valentine, Cecilia Diniz Behn, Katsumi Inoue, Samy Wu Fung

Transformer models underpin many recent advances in practical machine learning applications, yet understanding their internal behavior continues to elude researchers.

Training Dynamics of Contextual N-Grams in Language Models

1 code implementation • 1 Nov 2023 • Lucia Quirke, Lovis Heindrich, Wes Gurnee, Neel Nanda

We show that this neuron exists within a broader contextual n-gram circuit: we find late layer neurons which recognize and continue n-grams common in German text, but which only activate if the German neuron is active.
