1 code implementation • 6 Feb 2024 • Nora Belrose, Quintin Pope, Lucia Quirke, Alex Mallen, Xiaoli Fern
The distributional simplicity bias (DSB) posits that neural networks learn low-order moments of the data distribution before moving on to higher-order correlations.
1 code implementation • 5 Dec 2023 • Michael Igorevich Ivanitskiy, Alex F. Spies, Tilman Räuker, Guillaume Corlouer, Chris Mathwin, Lucia Quirke, Can Rager, Rusheb Shah, Dan Valentine, Cecilia Diniz Behn, Katsumi Inoue, Samy Wu Fung
Transformer models underpin many recent advances in practical machine learning applications, yet understanding their internal behavior continues to elude researchers.
1 code implementation • 1 Nov 2023 • Lucia Quirke, Lovis Heindrich, Wes Gurnee, Neel Nanda
We show that this neuron exists within a broader contextual n-gram circuit: we find late-layer neurons that recognize and continue n-grams common in German text, but that activate only if the German neuron is active.
1 code implementation • 19 Sep 2023 • Michael Igorevich Ivanitskiy, Rusheb Shah, Alex F. Spies, Tilman Räuker, Dan Valentine, Can Rager, Lucia Quirke, Chris Mathwin, Guillaume Corlouer, Cecilia Diniz Behn, Samy Wu Fung
Understanding how machine learning models respond to distributional shifts is a key research challenge.