Search Results for author: João Sacramento

Found 18 papers, 13 papers with code

Discovering modular solutions that generalize compositionally

1 code implementation • 22 Dec 2023 • Simon Schug, Seijin Kobayashi, Yassir Akram, Maciej Wołczyk, Alexandra Proca, Johannes von Oswald, Razvan Pascanu, João Sacramento, Angelika Steger

This allows us to relate the problem of compositional generalization to that of identification of the underlying modules.

Meta-Learning

Uncovering mesa-optimization algorithms in Transformers

no code implementations • 11 Sep 2023 • Johannes von Oswald, Eyvind Niklasson, Maximilian Schlegel, Seijin Kobayashi, Nicolas Zucchet, Nino Scherrer, Nolan Miller, Mark Sandler, Blaise Agüera y Arcas, Max Vladymyrov, Razvan Pascanu, João Sacramento

Transformers have become the dominant model in deep learning, but the reason for their superior performance is poorly understood.

In-Context Learning, Language Modelling

Gated recurrent neural networks discover attention

no code implementations • 4 Sep 2023 • Nicolas Zucchet, Seijin Kobayashi, Yassir Akram, Johannes von Oswald, Maxime Larcher, Angelika Steger, João Sacramento

In particular, we examine RNNs trained to solve simple in-context learning tasks on which Transformers are known to excel and find that gradient descent instills in our RNNs the same attention-based in-context learning algorithm used by Transformers.

In-Context Learning

Online learning of long-range dependencies

1 code implementation • NeurIPS 2023 • Nicolas Zucchet, Robert Meier, Simon Schug, Asier Mujika, João Sacramento

Online learning holds the promise of enabling efficient long-term credit assignment in recurrent neural networks.

Transformers learn in-context by gradient descent

1 code implementation • 15 Dec 2022 • Johannes von Oswald, Eyvind Niklasson, Ettore Randazzo, João Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, Max Vladymyrov

We start by providing a simple weight construction that shows the equivalence of data transformations induced by 1) a single linear self-attention layer and by 2) gradient-descent (GD) on a regression loss.

In-Context Learning, Meta-Learning (+1 more)

The least-control principle for local learning at equilibrium

1 code implementation • 4 Jul 2022 • Alexander Meulemans, Nicolas Zucchet, Seijin Kobayashi, Johannes von Oswald, João Sacramento

As special cases, they include models of great current interest in both neuroscience and machine learning, such as deep neural networks, equilibrium recurrent neural networks, deep equilibrium models, or meta-learning.

BIG-bench Machine Learning, Meta-Learning

Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation

no code implementations • 6 May 2022 • Nicolas Zucchet, João Sacramento

This paper reviews gradient-based techniques to solve bilevel optimization problems.

Bilevel Optimization

Minimizing Control for Credit Assignment with Strong Feedback

2 code implementations • 14 Apr 2022 • Alexander Meulemans, Matilde Tristany Farinha, Maria R. Cervera, João Sacramento, Benjamin F. Grewe

Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.

Learning where to learn: Gradient sparsity in meta and continual learning

1 code implementation • NeurIPS 2021 • Johannes von Oswald, Dominic Zhao, Seijin Kobayashi, Simon Schug, Massimo Caccia, Nicolas Zucchet, João Sacramento

We find that patterned sparsity emerges from this process, with the pattern of sparsity varying on a problem-by-problem basis.

Continual Learning, Inductive Bias (+2 more)

Credit Assignment in Neural Networks through Deep Feedback Control

3 code implementations • NeurIPS 2021 • Alexander Meulemans, Matilde Tristany Farinha, Javier García Ordóñez, Pau Vilimelis Aceituno, João Sacramento, Benjamin F. Grewe

The success of deep learning sparked interest in whether the brain learns by using similar techniques for assigning credit to each synaptic weight for its contribution to the network output.

Conductance-based dendrites perform Bayes-optimal cue integration

no code implementations • 27 Apr 2021 • Jakob Jordan, João Sacramento, Willem A. M. Wybo, Mihai A. Petrovici, Walter Senn

We propose a novel, Bayesian view on the dynamics of conductance-based neurons and synapses which suggests that they are naturally equipped to optimally perform information integration.

A contrastive rule for meta-learning

1 code implementation • 4 Apr 2021 • Nicolas Zucchet, Simon Schug, Johannes von Oswald, Dominic Zhao, João Sacramento

Humans and other animals are capable of improving their learning performance as they solve related tasks from a given problem domain, to the point of being able to learn from extremely limited data.

Meta-Learning

Posterior Meta-Replay for Continual Learning

3 code implementations • NeurIPS 2021 • Christian Henning, Maria R. Cervera, Francesco D'Angelo, Johannes von Oswald, Regina Traber, Benjamin Ehret, Seijin Kobayashi, Benjamin F. Grewe, João Sacramento

We offer a practical deep learning implementation of our framework based on probabilistic task-conditioned hypernetworks, an approach we term posterior meta-replay.

Continual Learning
