Search Results for author: Nan Rosemary Ke

Found 39 papers, 15 papers with code

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

2 code implementations • 30 May 2022 • Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio

A recurrent slow stream aims to learn a specialized, compressed representation by forcing each chunk of $K$ time steps into a single representation, which is then divided into multiple vectors (a minimal sketch follows below).

Decision Making • Inductive Bias
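To make the chunking concrete, here is a minimal sketch assuming a single linear projection for the compression step; the class name SlowStreamChunk and the sizes are illustrative, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class SlowStreamChunk(nn.Module):
    """Toy sketch: compress each chunk of K time steps into M latent vectors."""
    def __init__(self, d_model: int, K: int, M: int):
        super().__init__()
        self.K, self.M = K, M
        self.compress = nn.Linear(K * d_model, M * d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T, d_model), with T divisible by K
        B, T, D = x.shape
        chunks = x.reshape(B, T // self.K, self.K * D)  # group K steps per chunk
        z = self.compress(chunks)                       # one compressed representation per chunk
        return z.reshape(B, T // self.K, self.M, D)     # divide it into M vectors
```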

Task Loss Estimation for Sequence Prediction

1 code implementation • 19 Nov 2015 • Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio

Our idea is that this score can be interpreted as an estimate of the task loss, and that the estimation error may be used as a consistent surrogate loss.

Language Modelling • Speech Recognition +1
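A minimal sketch of the surrogate idea above, under the assumption that the model emits one scalar score per hypothesis and the true task loss (e.g. a character error rate) is available during training; the names here are hypothetical:

```python
import torch
import torch.nn.functional as F

def surrogate_loss(predicted_score: torch.Tensor, task_loss: torch.Tensor) -> torch.Tensor:
    """Squared estimation error: small exactly when the model's score
    is a good estimate of the true task loss."""
    return F.mse_loss(predicted_score, task_loss)
```

Minimizing the estimation error drives the score toward the task loss itself, which is what makes it usable as a surrogate objective.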

Learning Neural Causal Models from Unknown Interventions

2 code implementations • 2 Oct 2019 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael C. Mozer, Chris Pal, Yoshua Bengio

Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.

Meta-Learning

Variational Walkback: Learning a Transition Operator as a Stochastic Recurrent Net

1 code implementation • NeurIPS 2017 • Anirudh Goyal, Nan Rosemary Ke, Surya Ganguli, Yoshua Bengio

The energy function is then modified so the model and data distributions match, with no guarantee on the number of steps required for the Markov chain to converge.

Towards Understanding How Machines Can Learn Causal Overhypotheses

1 code implementation • 16 Jun 2022 • Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, Alison Gopnik

Recent work in machine learning and cognitive science has suggested that understanding causal information is essential to the development of intelligence.

BIG-bench Machine Learning • Causal Inference

h-detach: Modifying the LSTM Gradient Towards Better Optimization

1 code implementation • ICLR 2019 • Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio

This problem becomes more evident in tasks where the information needed to solve them correctly exists over long time scales, because the exploding and vanishing gradients problem (EVGP) prevents important gradient components from being back-propagated adequately over a large number of steps.
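The remedy, as the title suggests, is to stochastically detach the hidden-state path during the unroll so that gradients favor the more stable, linear cell-state path of the LSTM. A rough sketch, where the detach probability and the use of torch.nn.LSTMCell are illustrative choices:

```python
import torch
import torch.nn as nn

def h_detach_unroll(cell: nn.LSTMCell, xs: torch.Tensor, p: float = 0.25):
    """Unroll an LSTM cell over xs of shape (T, batch, input_size),
    blocking the gradient through the hidden-state path with probability p."""
    T, B = xs.shape[0], xs.shape[1]
    h = torch.zeros(B, cell.hidden_size)
    c = torch.zeros(B, cell.hidden_size)
    outputs = []
    for t in range(T):
        h_in = h.detach() if torch.rand(1).item() < p else h  # sometimes cut the h-path
        h, c = cell(xs[t], (h_in, c))  # the cell-state path always carries gradient
        outputs.append(h)
    return torch.stack(outputs), (h, c)
```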

Cascading Bandits for Large-Scale Recommendation Problems

1 code implementation • 17 Mar 2016 • Shi Zong, Hao Ni, Kenny Sung, Nan Rosemary Ke, Zheng Wen, Branislav Kveton

In this work, we study cascading bandits, an online learning variant of the cascade model in which the goal is to recommend the $K$ most attractive items from a large set of $L$ candidate items (a policy sketch follows below).

Multi-Armed Bandits • Recommendation Systems +1
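For flavor, a sketch of a CascadeUCB1-style policy from the earlier cascading-bandit work this paper builds on; the exploration constant and bookkeeping details are illustrative, and this is not the paper's large-scale linear variant:

```python
import numpy as np

class CascadeUCB1:
    """Recommend K of L items; learn attraction probabilities from click feedback."""
    def __init__(self, L: int, K: int):
        self.L, self.K = L, K
        self.pulls = np.zeros(L)   # times each item's feedback was observed
        self.means = np.zeros(L)   # empirical attraction probability per item
        self.t = 0

    def recommend(self) -> np.ndarray:
        self.t += 1
        bonus = np.sqrt(1.5 * np.log(self.t) / np.maximum(self.pulls, 1))
        ucb = self.means + bonus
        ucb[self.pulls == 0] = np.inf      # examine every item at least once
        return np.argsort(-ucb)[:self.K]   # the K items with the highest upper bounds

    def update(self, items: np.ndarray, click_idx) -> None:
        # Cascade feedback: items ranked above the click were examined but not
        # clicked; the clicked item gets reward 1; items below it are unobserved.
        examined = items if click_idx is None else items[: click_idx + 1]
        for rank, item in enumerate(examined):
            reward = float(click_idx is not None and rank == click_idx)
            self.pulls[item] += 1
            self.means[item] += (reward - self.means[item]) / self.pulls[item]
```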

On the Convergence of Continuous Constrained Optimization for Structure Learning

1 code implementation • 23 Nov 2020 • Ignavier Ng, Sébastien Lachapelle, Nan Rosemary Ke, Simon Lacoste-Julien, Kun Zhang

Recently, structure learning of directed acyclic graphs (DAGs) has been formulated as a continuous optimization problem by leveraging an algebraic characterization of acyclicity.
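The algebraic characterization referred to is the NOTEARS-style constraint: a smooth function of the weighted adjacency matrix that is zero exactly when the matrix encodes a DAG:

```python
import numpy as np
from scipy.linalg import expm

def h(W: np.ndarray) -> float:
    """NOTEARS acyclicity measure: h(W) = tr(exp(W * W)) - d, where * is the
    elementwise (Hadamard) product. h(W) = 0 iff W has no directed cycles."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d
```

Structure learning then minimizes a data-fit score subject to h(W) = 0, typically via an augmented Lagrangian scheme, and it is the convergence behavior of that continuous constrained formulation that this paper analyzes.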

ACtuAL: Actor-Critic Under Adversarial Learning

no code implementations • 13 Nov 2017 • Anirudh Goyal, Nan Rosemary Ke, Alex Lamb, R. Devon Hjelm, Chris Pal, Joelle Pineau, Yoshua Bengio

This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function.

Language Modelling

Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks

no code implementations • ICLR 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Laurent Charlin, Chris Pal, Yoshua Bengio

A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the forward computation.

Transferring Knowledge from a RNN to a DNN

no code implementations • 7 Apr 2015 • William Chan, Nan Rosemary Ke, Ian Lane

The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task, compared to a baseline of 4.54 WER, a relative improvement of more than 13%.

Automatic Speech Recognition (ASR) +1

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding

no code implementations • 11 Sep 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio

We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.

Temporal Sequences

Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding

no code implementations • NeurIPS 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio

We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.

Temporal Sequences

Amortized learning of neural causal representations

no code implementations • 21 Aug 2020 • Nan Rosemary Ke, Jane X. Wang, Jovana Mitrovic, Martin Szummer, Danilo J. Rezende

The CRN represents causal models using continuous representations and hence can scale much better with the number of variables.

Dependency Structure Discovery from Interventions

no code implementations • 1 Jan 2021 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Bernhard Schölkopf, Michael Curtis Mozer, Hugo Larochelle, Christopher Pal, Yoshua Bengio

Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.

Fast and Slow Learning of Recurrent Independent Mechanisms

no code implementations • 18 May 2021 • Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio

To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.

Meta-Learning

Learning to Induce Causal Structure

no code implementations • 11 Apr 2022 • Nan Rosemary Ke, Silvia Chiappa, Jane Wang, Anirudh Goyal, Jorg Bornschein, Melanie Rey, Theophane Weber, Matthew Botvinick, Michael Mozer, Danilo Jimenez Rezende

The fundamental challenge in causal induction is to infer the underlying graph structure given observational and/or interventional data.

On the Generalization and Adaption Performance of Causal Models

no code implementations • 9 Jun 2022 • Nino Scherrer, Anirudh Goyal, Stefan Bauer, Yoshua Bengio, Nan Rosemary Ke

Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes and offer robust generalization.

Causal Discovery • Out-of-Distribution Generalization

Learning Latent Structural Causal Models

no code implementations • 24 Oct 2022 • Jithendaraa Subramanian, Yashas Annadani, Ivaxi Sheth, Nan Rosemary Ke, Tristan Deleu, Stefan Bauer, Derek Nowrouzezahrai, Samira Ebrahimi Kahou

For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions.

Bayesian Inference • Image Generation +1
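As a concrete reference point for that modeling assumption, here is a toy sampler for a linear Gaussian additive-noise SCM; the paper performs posterior inference over such models, whereas this sketch only generates data from a known one:

```python
import numpy as np

def sample_linear_scm(W: np.ndarray, sigma: float, n: int) -> np.ndarray:
    """Sample n observations from a linear Gaussian additive-noise SCM.
    W[i, j] is the edge weight i -> j; W must be strictly upper-triangular,
    so the column order is a valid topological order of the DAG."""
    d = W.shape[0]
    X = np.zeros((n, d))
    for j in range(d):
        # each variable = linear function of its parents + Gaussian noise
        X[:, j] = X @ W[:, j] + sigma * np.random.randn(n)
    return X
```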

Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

no code implementations • 8 Feb 2023 • Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt

To generalize across tasks, an agent should acquire knowledge from past tasks that facilitate adaptation and exploration in future tasks.

Bayesian Inference • Thompson Sampling
