Search Results for author: Giancarlo Kerg

Found 8 papers, 3 papers with code

On Neural Architecture Inductive Biases for Relational Tasks

1 code implementation9 Jun 2022 Giancarlo Kerg, Sarthak Mittal, David Rolnick, Yoshua Bengio, Blake Richards, Guillaume Lajoie

Recent work has explored how forcing relational representations to remain distinct from sensory representations, as it seems to be the case in the brain, can help artificial systems.

Inductive Bias Out-of-Distribution Generalization

Continuous-Time Meta-Learning with Forward Mode Differentiation

no code implementations ICLR 2022 Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon

Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.

Few-Shot Image Classification Meta-Learning

Advantages of biologically-inspired adaptive neural activation in RNNs during learning

no code implementations22 Jun 2020 Victor Geadah, Giancarlo Kerg, Stefan Horoi, Guy Wolf, Guillaume Lajoie

Dynamic adaptation in single-neuron response plays a fundamental role in neural coding in biological neural networks.

Transfer Learning

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

1 code implementation NeurIPS 2019 Giancarlo Kerg, Kyle Goyette, Maximilian Puelma Touzel, Gauthier Gidel, Eugene Vorontsov, Yoshua Bengio, Guillaume Lajoie

A recent strategy to circumvent the exploding and vanishing gradient problem in RNNs, and to allow the stable propagation of signals over long time scales, is to constrain recurrent connectivity matrices to be orthogonal or unitary.

h-detach: Modifying the LSTM Gradient Towards Better Optimization

1 code implementation ICLR 2019 Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio

This problem becomes more evident in tasks where the information needed to correctly solve them exist over long time scales, because EVGP prevents important gradient components from being back-propagated adequately over a large number of steps.

Cannot find the paper you are looking for? You can Submit a new open access paper.