Search Results for author: Andrew Saxe

Found 23 papers, 7 papers with code

Revisiting the Role of Relearning in Semantic Dementia

no code implementations · 5 Mar 2025 · Devon Jarvis, Verena Klar, Richard Klein, Benjamin Rosman, Andrew Saxe

While relearning of lost knowledge has been demonstrated after acute brain injuries such as stroke, it has not been widely supported in chronic cognitive diseases such as semantic dementia (SD).

Training Dynamics of In-Context Learning in Linear Attention

no code implementations · 27 Jan 2025 · Yedi Zhang, Aaditya K. Singh, Peter E. Latham, Andrew Saxe

For the merged parametrization, we show that the training dynamics have two fixed points and that the loss trajectory exhibits a single, abrupt drop.

In-Context Learning
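
No code accompanies this entry, so the following is a minimal numpy sketch of the kind of setup the abstract describes: a single linear-attention layer (attention without softmax) applied to an in-context linear-regression prompt, with the key and query weights merged into one matrix. All dimensions, names, and the prompt construction are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx = 4, 16                                  # input dimension, context length (illustrative)

# One in-context linear-regression prompt: labelled pairs (x_i, y_i) plus a query x_q.
w_true = rng.normal(size=d)
X = rng.normal(size=(n_ctx, d))
y = X @ w_true
x_q = rng.normal(size=d)
y_q = x_q @ w_true                                # held-out label the model should predict

# Tokens stack the input with its label; the query token carries a zero label slot.
Z = np.hstack([np.vstack([X, x_q]), np.append(y, 0.0)[:, None]])   # (n_ctx + 1, d + 1)

# Linear attention: no softmax, just (Z W_KQ Z^T)(Z W_V), with the key and query
# weights merged into the single matrix W_KQ (one of the parametrizations the paper
# contrasts); the initialization scale here is arbitrary.
W_KQ = 0.01 * rng.normal(size=(d + 1, d + 1))
W_V = 0.01 * rng.normal(size=(d + 1, d + 1))

attn_out = (Z @ W_KQ @ Z.T) @ (Z @ W_V) / n_ctx
y_pred = attn_out[-1, -1]                         # read the prediction off the query token
loss = 0.5 * (y_pred - y_q) ** 2
print(f"in-context prediction {y_pred:.3f}, loss {loss:.3f}")
```

Training would repeat this forward pass over many randomly drawn prompts and descend the loss with respect to W_KQ and W_V; the paper's analysis concerns exactly those training dynamics.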

Early learning of the optimal constant solution in neural networks and humans

no code implementations · 25 Jun 2024 · Jirko Rubruck, Jan P. Bauer, Andrew Saxe, Christopher Summerfield

We identify hallmarks of this early OCS phase and illustrate how these signatures are observed in deep linear networks and larger, more complex (and nonlinear) convolutional neural networks solving a hierarchical learning task based on MNIST and CIFAR10.
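
As a hedged illustration of the phenomenon described above (not the paper's experiments), the sketch below trains a small deep linear network on a toy regression task whose targets have a nonzero mean. In a run like this, the outputs typically collapse early onto an input-independent value near the target mean, the optimal constant solution, before input-dependent structure emerges; all sizes, scales, and learning rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_in, d_hid, d_out = 200, 8, 16, 3             # illustrative sizes

X = rng.normal(size=(n, d_in))
Y = 0.3 * X @ rng.normal(size=(d_in, d_out)) + 2.0   # targets with a nonzero mean

# Deep linear network with small random weights and a trainable output bias.
W1 = 1e-3 * rng.normal(size=(d_in, d_hid))
W2 = 1e-3 * rng.normal(size=(d_hid, d_out))
b2 = np.zeros(d_out)
lr = 0.05

for step in range(601):
    H = X @ W1
    pred = H @ W2 + b2
    err = pred - Y                                 # gradient of 0.5 * MSE w.r.t. pred
    grad_W2 = H.T @ err / n
    grad_b2 = err.mean(axis=0)
    grad_W1 = X.T @ (err @ W2.T) / n
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2
    if step in (0, 60, 600):
        spread = pred.std(axis=0).mean()           # near zero while the output is constant
        bias_gap = np.abs(pred.mean(axis=0) - Y.mean(axis=0)).mean()
        print(f"step {step:3d}  output spread {spread:.3f}  distance to target mean {bias_gap:.3f}")
```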

When Are Bias-Free ReLU Networks Effectively Linear Networks?

no code implementations · 18 Jun 2024 · Yedi Zhang, Andrew Saxe, Peter E. Latham

We then show that, under symmetry conditions on the data, these networks have the same learning dynamics as linear networks.
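
A minimal numpy sketch of the setting in the excerpt, under assumptions of my own choosing (a two-layer bias-free network and a symmetric dataset containing both x and -x with an odd, linear target): the ReLU network and the corresponding linear network are trained with the same full-batch gradient descent from the same initialization, and comparing the two final losses gives a feel for the claimed correspondence. This is an illustration of the setting, not a reproduction of the paper's analysis, and the exact correspondence involves details (such as a time rescaling) that the sketch glosses over.

```python
import numpy as np

d, h, n = 5, 64, 256

def make_symmetric_data(seed=2):
    rng = np.random.default_rng(seed)
    X_half = rng.normal(size=(n // 2, d))
    X = np.vstack([X_half, -X_half])              # every x appears together with -x
    y = X @ rng.normal(size=d)                    # odd (here: linear) target
    return X, y

def train(nonlinear, steps=2000, lr=0.1, seed=2):
    X, y = make_symmetric_data(seed)
    rng = np.random.default_rng(seed + 100)       # same initialization for both runs
    W1 = 0.1 * rng.normal(size=(d, h))
    w2 = 0.1 * rng.normal(size=h)
    for _ in range(steps):
        pre = X @ W1
        H = np.maximum(pre, 0.0) if nonlinear else pre    # bias-free ReLU vs. linear
        err = H @ w2 - y
        grad_w2 = H.T @ err / n
        back = np.outer(err, w2)
        if nonlinear:
            back = back * (pre > 0)               # ReLU gate in the backward pass
        W1 -= lr * X.T @ back / n
        w2 -= lr * grad_w2
    return 0.5 * np.mean(err ** 2)

print("bias-free ReLU net loss:", train(True))
print("linear net loss:        ", train(False))
```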

Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning

1 code implementation · 10 Jun 2024 · Daniel Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew Saxe, Surya Ganguli

While the impressive performance of modern neural networks is often attributed to their capacity to efficiently extract task-relevant features from data, the mechanisms underlying this rich feature learning regime remain elusive, with much of our theoretical understanding stemming from the opposing lazy regime.
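
Code is released with the paper; independently of it, here is a small numpy toy (assumptions: a two-layer linear network, Gaussian data, a linear teacher) contrasting a balanced initialization with an unbalanced one in which the readout layer starts much larger than the feature layer. The printed quantity, the relative movement of the first-layer weights, is one crude proxy for how much feature learning occurred; in a run like this the unbalanced setting typically shows far larger movement, but this is not the paper's measure or its exact solutions.

```python
import numpy as np

def relative_feature_movement(scale1, scale2, steps=300, lr=0.02, seed=3):
    """Train a two-layer linear net and report how far layer 1 moved from its init."""
    rng = np.random.default_rng(seed)
    d, h, n = 6, 6, 256
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d)                            # linear teacher
    W1 = scale1 * rng.normal(size=(d, h)) / np.sqrt(d)    # feature layer
    w2 = scale2 * rng.normal(size=h) / np.sqrt(h)         # readout layer
    W1_init = W1.copy()
    for _ in range(steps):
        H = X @ W1
        err = H @ w2 - y
        grad_w2 = H.T @ err / n
        grad_W1 = X.T @ np.outer(err, w2) / n
        W1 -= lr * grad_W1
        w2 -= lr * grad_w2
    return np.linalg.norm(W1 - W1_init) / np.linalg.norm(W1_init)

print("balanced init (1.0 / 1.0):   ", relative_feature_movement(1.0, 1.0))
print("unbalanced init (0.1 / 3.0): ", relative_feature_movement(0.1, 3.0))
```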

Tilting the Odds at the Lottery: the Interplay of Overparameterisation and Curricula in Neural Networks

1 code implementation · 3 Jun 2024 · Stefano Sarao Mannelli, Yaraslau Ivashynka, Andrew Saxe, Luca Saglietti

A wide range of empirical and theoretical works have shown that overparameterisation can amplify the performance of neural networks.

Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning

no code implementations · 28 Feb 2024 · Jin Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe

Diverse studies in systems neuroscience begin with extended periods of curriculum training known as 'shaping' procedures.

Deep Reinforcement Learning · Reinforcement Learning

Understanding Unimodal Bias in Multimodal Deep Linear Networks

1 code implementation · 1 Dec 2023 · Yedi Zhang, Peter E. Latham, Andrew Saxe

This is the first work to calculate the duration of the unimodal phase in learning as a function of the depth at which modalities are fused within the network, dataset statistics, and initialization.
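
To make the setting concrete, a small numpy toy of a late-fusion multimodal deep linear network follows (my own construction, not the paper's code or its calculation): each modality has a two-layer linear path, the paths are summed at the output, and one modality starts from a slightly larger initialization. In a run like this the network typically passes through a phase where its output depends almost entirely on one modality before the other catches up, which is the kind of unimodal phase whose duration the paper characterizes.

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, lr = 4, 512, 0.02

# Two modalities, each carrying part of the target (illustrative construction).
A = rng.normal(size=(n, d))
B = rng.normal(size=(n, d))
y = A @ np.ones(d) + B @ np.ones(d)

# One two-layer linear path per modality, fused by addition at the output.
# Modality A starts from a slightly larger initialization than modality B.
a1, a2 = 0.1 * np.ones(d), 0.1
b1, b2 = 0.001 * np.ones(d), 0.001

for step in range(801):
    out_a, out_b = (A @ a1) * a2, (B @ b1) * b2
    err = out_a + out_b - y
    grad_a1, grad_a2 = A.T @ err * a2 / n, (A @ a1) @ err / n
    grad_b1, grad_b2 = B.T @ err * b2 / n, (B @ b1) @ err / n
    a1 -= lr * grad_a1; a2 -= lr * grad_a2
    b1 -= lr * grad_b1; b2 -= lr * grad_b2
    if step % 100 == 0:
        # Compare how strongly the output currently depends on each modality.
        print(f"step {step:3d}  |A path| {out_a.std():.3f}  |B path| {out_b.std():.3f}")
```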

Are task representations gated in macaque prefrontal cortex?

no code implementations · 29 Jun 2023 · Timo Flesch, Valerio Mante, William Newsome, Andrew Saxe, Christopher Summerfield, David Sussillo

A recent paper (Flesch et al., 2022) describes behavioural and neural data suggesting that task representations are gated in the prefrontal cortex in both humans and macaques.

Know your audience: specializing grounded language models with listener subtraction

no code implementations · 16 Jun 2022 · Aaditya K. Singh, David Ding, Andrew Saxe, Felix Hill, Andrew K. Lampinen

Through controlled experiments, we show that training a speaker with two listeners that perceive differently, using our method, allows the speaker to adapt to the idiosyncrasies of the listeners.

Language Modelling · Large Language Model

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

1 code implementation · 18 May 2022 · Sebastian Lee, Stefano Sarao Mannelli, Claudia Clopath, Sebastian Goldt, Andrew Saxe

Continual learning - learning new tasks in sequence while maintaining performance on old tasks - remains particularly challenging for artificial neural networks.

Continual Learning

Modelling continual learning in humans with Hebbian context gating and exponentially decaying task signals

1 code implementation · 22 Mar 2022 · Timo Flesch, David G. Nagy, Andrew Saxe, Christopher Summerfield

Here, we propose novel computational constraints for artificial neural networks, inspired by earlier work on gating in the primate prefrontal cortex, that capture the cost of interleaved training and allow the network to learn two tasks in sequence without forgetting.

Continual Learning
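
The paper's own code is linked above; as a very loose, self-contained illustration of the two ingredients named in the title (not the authors' model), the numpy sketch below (a) uses a Hebbian update to associate a one-hot context unit with the hidden units that are most active for that task's inputs, and (b) multiplicatively gates the hidden layer with a context signal that decays exponentially over trials. Every quantity here (the task statistics, the gate form, the decay constant) is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)
d, h, n_tasks = 10, 40, 2

W_in = rng.normal(size=(d, h)) / np.sqrt(d)      # shared input-to-hidden weights
contexts = np.eye(n_tasks)                       # one-hot task cues
W_ctx = np.zeros((n_tasks, h))                   # context-to-hidden associations

# Hebbian step: delta_w = eta * pre * post, with the context unit as "pre" and the
# average hidden activity on that task's inputs as "post".
for task in range(n_tasks):
    X_task = rng.normal(loc=2.0 * task, size=(200, d))   # toy task-specific input statistics
    post = np.maximum(X_task @ W_in, 0.0).mean(axis=0)
    W_ctx += 0.1 * np.outer(contexts[task], post)

def gated_hidden(x, task, trial, tau=20.0, k=10):
    """Hidden activity gated by the task context, whose signal decays over trials."""
    signal = np.exp(-trial / tau)                         # exponentially decaying task signal
    gate = np.zeros(h)
    gate[np.argsort(W_ctx[task])[-k:]] = 1.0              # units most associated with this task
    hidden = np.maximum(x @ W_in, 0.0)
    return hidden * (signal * gate + (1.0 - signal))      # strong gating early, weaker gating later

x = rng.normal(size=d)
print("active units on trial 0:  ", int((gated_hidden(x, 0, 0) > 0).sum()))
print("active units on trial 100:", int((gated_hidden(x, 0, 100) > 0).sum()))
```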

Continual Learning in the Teacher-Student Setup: Impact of Task Similarity

1 code implementation · 9 Jul 2021 · Sebastian Lee, Sebastian Goldt, Andrew Saxe

Using each teacher to represent a different task, we investigate how the relationship between teachers affects the amount of forgetting and transfer exhibited by the student when the task switches.

Continual Learning
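
The released code implements the paper's two-layer teacher-student setup; the deliberately simplified numpy sketch below uses linear (one-layer) teachers instead, just to show the shape of the experiment: two teachers whose weight vectors have a controllable overlap, a student trained to convergence on teacher 1 and then on teacher 2, and the loss on task 1 measured afterwards as forgetting. The teacher architecture, sample sizes, and similarity parametrization are all my own simplifications.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n, lr = 20, 1000, 0.1
X = rng.normal(size=(n, d))

def teacher_pair(similarity):
    """Two linear teachers whose weight vectors have the given overlap (cosine)."""
    w1 = rng.normal(size=d); w1 /= np.linalg.norm(w1)
    u = rng.normal(size=d); u -= (u @ w1) * w1; u /= np.linalg.norm(u)
    return w1, similarity * w1 + np.sqrt(1.0 - similarity ** 2) * u

def task_loss(w_student, w_teacher):
    return np.mean((X @ w_student - X @ w_teacher) ** 2)

for sim in (0.0, 0.5, 0.99):
    wT1, wT2 = teacher_pair(sim)
    w = np.zeros(d)
    for wT in (wT1, wT2):                          # task 1, then task 2, in sequence
        for _ in range(500):
            w -= lr * X.T @ (X @ w - X @ wT) / n
    print(f"teacher overlap {sim:.2f}: task-1 loss after training on task 2 = {task_loss(w, wT1):.3f}")
```

With a linear student trained to convergence, the task-1 loss after task 2 is essentially the squared distance between the two teachers, so forgetting shrinks as the overlap grows; the paper analyses how this picture changes for nonlinear students.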

An Analytical Theory of Curriculum Learning in Teacher-Student Networks

no code implementations · 15 Jun 2021 · Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe

In the online learning setting, we provide an exact description of the dynamics, confirming the long-standing experimental observation that curricula can modestly speed up learning.

Probing transfer learning with a model of synthetic correlated datasets

no code implementations · 9 Jun 2021 · Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task.

Binary Classification · Transfer Learning
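
As a generic, hedged illustration of transfer between correlated tasks (not the paper's synthetic model or its analysis), the sketch below builds a data-abundant source task and a data-scarce target task whose teacher vectors have correlation rho, then compares a target-only ridge fit with one regularized toward the source solution. The construction, the estimator, and all parameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)
d, rho = 30, 0.9

def make_task(w, n):
    """Binary classification task whose labels are the sign of w.x."""
    X = rng.normal(size=(n, d))
    return X, np.sign(X @ w)

# Source and target teacher vectors with correlation (relatedness) rho.
w_src = rng.normal(size=d); w_src /= np.linalg.norm(w_src)
u = rng.normal(size=d); u -= (u @ w_src) * w_src; u /= np.linalg.norm(u)
w_tgt = rho * w_src + np.sqrt(1.0 - rho ** 2) * u

X_src, y_src = make_task(w_src, 2000)     # data-abundant source task
X_tgt, y_tgt = make_task(w_tgt, 40)       # data-scarce target task

def ridge(X, y, lam=1.0, w0=None):
    """Ridge regression shrinking toward w0 (w0=None means the usual zero prior)."""
    w0 = np.zeros(d) if w0 is None else w0
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w0)

w_scratch = ridge(X_tgt, y_tgt)                             # target data only
w_transfer = ridge(X_tgt, y_tgt, w0=ridge(X_src, y_src))    # shrink toward the source fit

X_test, y_test = make_task(w_tgt, 5000)
for name, w in (("scratch", w_scratch), ("transfer", w_transfer)):
    print(f"{name:8s} target accuracy: {np.mean(np.sign(X_test @ w) == y_test):.3f}")
```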

Characterizing emergent representations in a space of candidate learning rules for deep networks

no code implementations · NeurIPS 2020 · Yinan Cao, Christopher Summerfield, Andrew Saxe

Studies suggesting that representations in deep networks resemble those in biological brains have mostly relied on one specific learning rule: gradient descent, the workhorse behind modern deep learning.

If deep learning is the answer, then what is the question?

no code implementations · 16 Apr 2020 · Andrew Saxe, Stephanie Nelli, Christopher Summerfield

In this Perspective, our goal is to offer a roadmap for systems neuroscience research in the age of deep learning.

Neurons and Cognition

Are Efficient Deep Representations Learnable?

no code implementations · 17 Jul 2018 · Maxwell Nye, Andrew Saxe

Specifically, we train deep neural networks to learn two simple functions with known efficient solutions: the parity function and the fast Fourier transform.

Deep Learning
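
Taking the parity half of that experiment as an example, the sketch below trains a small tanh network on the full truth table of the 4-bit parity function with plain full-batch gradient descent in numpy. The architecture, learning rate, and step count are arbitrary choices, not the paper's; parity is famously slow for shallow networks to learn, so the final accuracy depends on those choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n_bits, h = 4, 64
n = 2 ** n_bits

# Full truth table of the n-bit parity function, with bits mapped to +/-1.
bits = np.array([[(i >> b) & 1 for b in range(n_bits)] for i in range(n)], dtype=float)
y = bits.sum(axis=1) % 2
X = 2.0 * bits - 1.0

W1 = rng.normal(size=(n_bits, h)) / np.sqrt(n_bits)
b1 = np.zeros(h)
w2 = rng.normal(size=h) / np.sqrt(h)
lr = 0.3

for _ in range(30000):
    pre = X @ W1 + b1
    hid = np.tanh(pre)
    p = 1.0 / (1.0 + np.exp(-(hid @ w2)))          # sigmoid output
    err = p - y                                    # gradient of the cross-entropy loss
    grad_w2 = hid.T @ err / n
    back = np.outer(err, w2) * (1.0 - hid ** 2)
    W1 -= lr * X.T @ back / n
    b1 -= lr * back.mean(axis=0)
    w2 -= lr * grad_w2

print("training accuracy on 4-bit parity:", np.mean((p > 0.5) == y))
```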

Tensor Switching Networks

1 code implementation · NeurIPS 2016 · Chuan-Yung Tsai, Andrew Saxe, David Cox

We present a novel neural network algorithm, the Tensor Switching (TS) network, which generalizes the Rectified Linear Unit (ReLU) nonlinearity to tensor-valued hidden units.

Representation Learning
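
The abstract's core idea can be shown in a few lines of numpy: a tensor-switching layer reuses the ReLU's on/off decisions but, where a ReLU unit would pass its scalar pre-activation, an active TS unit passes the whole input vector, so the hidden layer becomes tensor-valued. The sketch below is only a forward-pass illustration of that nonlinearity; the sizes are arbitrary, and the paper's learning algorithms for such networks are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(8)
d, h, n = 5, 3, 4                                 # input dim, hidden units, batch size (illustrative)

X = rng.normal(size=(n, d))
W = rng.normal(size=(d, h))

# Standard ReLU layer: each active unit passes its scalar pre-activation.
relu_out = np.maximum(X @ W, 0.0)                 # shape (n, h)

# Tensor Switching layer: the same switching decisions, but each active unit
# passes the entire input vector, giving a tensor-valued hidden representation.
gate = (X @ W > 0).astype(float)                  # which units switch on
ts_out = gate[:, :, None] * X[:, None, :]         # shape (n, h, d)

print("ReLU hidden shape:", relu_out.shape, " TS hidden shape:", ts_out.shape)
```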

Unsupervised learning models of primary cortical receptive fields and receptive field plasticity

no code implementations · NeurIPS 2011 · Maneesh Bhand, Ritvik Mudur, Bipin Suresh, Andrew Saxe, Andrew Y. Ng

In this work we focus on that component of adaptation which occurs during an organism's lifetime, and show that a number of unsupervised feature learning algorithms can account for features of normal receptive field properties across multiple primary sensory cortices.

Measuring Invariances in Deep Networks

no code implementations · NeurIPS 2009 · Ian Goodfellow, Honglak Lee, Quoc V. Le, Andrew Saxe, Andrew Y. Ng

Our evaluation metrics can also be used to evaluate future work in unsupervised deep learning, and thus help the development of future algorithms.
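
No code is linked for this entry; as a rough, hedged stand-in for the kind of measurement the abstract refers to (and explicitly not the selectivity-based invariance score defined in the paper), the numpy sketch below compares each hidden unit's response variability under a set of transformations of one stimulus against its variability over random stimuli; lower values indicate more transformation-invariant units. The network, the circular-shift "transformations", and the ratio itself are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(9)
d, h = 32, 16
W = rng.normal(size=(d, h)) / np.sqrt(d)          # a toy one-layer representation

def responses(X):
    return np.maximum(X @ W, 0.0)

# One base stimulus and a family of transformed versions (circular shifts stand in
# for the translations/rotations a real evaluation would apply to images or audio).
x = rng.normal(size=d)
transformed = np.stack([np.roll(x, s) for s in range(-3, 4)])
random_inputs = rng.normal(size=(500, d))

# Crude invariance proxy: variability under transformation relative to variability
# over random inputs. Values near 0 mean a unit barely changes when the stimulus
# is transformed, i.e. it is (by this proxy) more invariant.
score = responses(transformed).std(axis=0) / (responses(random_inputs).std(axis=0) + 1e-8)
print(np.round(score, 2))
```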
