no code implementations • 12 Apr 2023 • Nan Rosemary Ke, Sara-Jane Dunn, Jorg Bornschein, Silvia Chiappa, Melanie Rey, Jean-Baptiste Lespiau, Albin Cassirer, Jane Wang, Theophane Weber, David Barrett, Matthew Botvinick, Anirudh Goyal, Mike Mozer, Danilo Rezende
To accurately identify gene regulatory networks (GRNs), perturbational data is required.
no code implementations • 8 Feb 2023 • Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt
To generalize across tasks, an agent should acquire knowledge from past tasks that facilitates adaptation and exploration in future tasks.
no code implementations • 24 Oct 2022 • Jithendaraa Subramanian, Yashas Annadani, Ivaxi Sheth, Nan Rosemary Ke, Tristan Deleu, Stefan Bauer, Derek Nowrouzezahrai, Samira Ebrahimi Kahou
For linear Gaussian additive noise SCMs, we present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent SCM from random, known interventions.
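To fix intuition for this setting, here is a minimal sketch of ancestral sampling from a linear Gaussian additive-noise SCM. This is illustrative only: the paper concerns inferring such models, and the function name and the upper-triangular weighted-adjacency convention are assumptions of this sketch, not the paper's code.

```python
import numpy as np

def sample_linear_gaussian_scm(W, n_samples, noise_std=1.0, seed=0):
    # x_j = sum_i W[i, j] * x_i + eps_j, with eps_j ~ N(0, noise_std^2).
    # Assumes variables are already in topological order, i.e. W[i, j] != 0
    # only for i < j (strictly upper-triangular weighted adjacency).
    rng = np.random.default_rng(seed)
    d = W.shape[0]
    X = np.zeros((n_samples, d))
    for j in range(d):  # parents of j all have smaller index, so are filled
        X[:, j] = X @ W[:, j] + rng.normal(0.0, noise_std, size=n_samples)
    return X

# Example: chain 0 -> 1 -> 2
W = np.array([[0.0, 2.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 0.0, 0.0]])
X = sample_linear_gaussian_scm(W, n_samples=1000)
```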
1 code implementation • 16 Jun 2022 • Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, Alison Gopnik
Recent work in machine learning and cognitive science has suggested that understanding causal information is essential to the development of intelligence.
no code implementations • 9 Jun 2022 • Nino Scherrer, Anirudh Goyal, Stefan Bauer, Yoshua Bengio, Nan Rosemary Ke
Our analysis shows that the modular neural causal models outperform other models on both zero- and few-shot adaptation in low-data regimes and offer robust generalization.
2 code implementations • 30 May 2022 • Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio
A slow stream that is recurrent in nature aims to learn a specialized and compressed representation by forcing chunks of $K$ time steps into a single representation, which is divided into multiple vectors.
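To make the chunking mechanism concrete, here is a minimal sketch of a recurrent slow stream that compresses every $K$ fast-stream steps into one state. The module name, the GRU cell, and mean-pooling over each chunk are assumptions of this sketch, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SlowStream(nn.Module):
    # Compress each chunk of K fast-stream states into one slow-stream update.
    def __init__(self, d_model, chunk_size):
        super().__init__()
        self.chunk_size = chunk_size
        self.cell = nn.GRUCell(d_model, d_model)  # recurrent slow update

    def forward(self, fast_states):  # (T, B, d_model), T divisible by K
        B, D = fast_states.shape[1], fast_states.shape[2]
        slow = fast_states.new_zeros(B, D)
        outs = []
        for chunk in fast_states.split(self.chunk_size, dim=0):
            summary = chunk.mean(dim=0)      # K steps -> one compressed vector
            slow = self.cell(summary, slow)  # slow stream updates once per chunk
            outs.append(slow)
        return torch.stack(outs)             # (T // K, B, d_model)
```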
no code implementations • 11 Apr 2022 • Nan Rosemary Ke, Silvia Chiappa, Jane Wang, Anirudh Goyal, Jorg Bornschein, Melanie Rey, Theophane Weber, Matthew Botvinick, Michael Mozer, Danilo Jimenez Rezende
The fundamental challenge in causal induction is to infer the underlying graph structure given observational and/or interventional data.
no code implementations • 21 Feb 2022 • Eliza Kosoy, Adrian Liu, Jasmine Collins, David M Chan, Jessica B Hamrick, Nan Rosemary Ke, Sandy H Huang, Bryanna Kaufmann, John Canny, Alison Gopnik
Despite recent progress in reinforcement learning (RL), RL algorithms for exploration remain an active area of research.
no code implementations • 17 Feb 2022 • Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent Sifre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell
In this paper we explore an alternative paradigm in which we train a network to map a dataset of past experiences to optimal behavior.
1 code implementation • 6 Sep 2021 • Nino Scherrer, Olexa Bilaniuk, Yashas Annadani, Anirudh Goyal, Patrick Schwab, Bernhard Schölkopf, Michael C. Mozer, Yoshua Bengio, Stefan Bauer, Nan Rosemary Ke
Discovering causal structures from data is a challenging inference problem of fundamental importance in all areas of science.
1 code implementation • 2 Jul 2021 • Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Rezende, Yoshua Bengio, Michael Mozer, Christopher Pal
A central goal for AI and causality is thus the joint discovery of abstract representations and causal structure.
no code implementations • 18 May 2021 • Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio
To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.
no code implementations • NeurIPS 2021 • Anirudh Goyal, Aniket Didolkar, Nan Rosemary Ke, Charles Blundell, Philippe Beaudoin, Nicolas Heess, Michael Mozer, Yoshua Bengio
First, GNNs do not predispose interactions to be sparse, as relationships among independent entities are likely to be.
1 code implementation • ICLR 2022 • Anirudh Goyal, Aniket Didolkar, Alex Lamb, Kartikeya Badola, Nan Rosemary Ke, Nasim Rahaman, Jonathan Binas, Charles Blundell, Michael Mozer, Yoshua Bengio
We explore the use of such a communication channel in the context of deep learning for modeling the structure of complex environments.
no code implementations • 22 Feb 2021 • Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, Yoshua Bengio
The two fields of machine learning and graphical causality arose and developed separately.
no code implementations • ICLR 2021 • Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio
Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution.
no code implementations • 1 Jan 2021 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Bernhard Schölkopf, Michael Curtis Mozer, Hugo Larochelle, Christopher Pal, Yoshua Bengio
Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.
1 code implementation • 23 Nov 2020 • Ignavier Ng, Sébastien Lachapelle, Nan Rosemary Ke, Simon Lacoste-Julien, Kun Zhang
Recently, structure learning of directed acyclic graphs (DAGs) has been formulated as a continuous optimization problem by leveraging an algebraic characterization of acyclicity.
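The algebraic characterization referenced here is, in the NOTEARS line of work this paper builds on, the trace-of-matrix-exponential constraint $h(W) = \operatorname{tr}(e^{W \circ W}) - d$, which is zero exactly when the weighted adjacency matrix $W$ encodes a DAG. A minimal sketch of that constraint (not the paper's optimization procedure):

```python
import numpy as np
from scipy.linalg import expm

def acyclicity(W):
    # h(W) = tr(exp(W * W)) - d; zero iff the graph of W is acyclic
    # (W * W is the elementwise square, ensuring nonnegative entries).
    return np.trace(expm(W * W)) - W.shape[0]

print(acyclicity(np.array([[0.0, 1.5], [0.0, 0.0]])))  # ~0.0 (DAG)
print(acyclicity(np.array([[0.0, 1.5], [0.7, 0.0]])))  # > 0  (2-cycle)
```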
no code implementations • 21 Aug 2020 • Nan Rosemary Ke, Jane X. Wang, Jovana Mitrovic, Martin Szummer, Danilo J. Rezende
The CRN represents causal models using continuous representations and hence could scale much better with the number of variables.
no code implementations • 7 Feb 2020 • Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing
In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions.
2 code implementations • 2 Oct 2019 • Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Hugo Larochelle, Bernhard Schölkopf, Michael C. Mozer, Chris Pal, Yoshua Bengio
Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.
no code implementations • ICLR 2019 • Nan Rosemary Ke, Amanpreet Singh, Ahmed Touati, Anirudh Goyal, Yoshua Bengio, Devi Parikh, Dhruv Batra
This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration.
no code implementations • 5 Mar 2019 • Nan Rosemary Ke, Amanpreet Singh, Ahmed Touati, Anirudh Goyal, Yoshua Bengio, Devi Parikh, Dhruv Batra
This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration.
no code implementations • NeurIPS 2018 • Nan Rosemary Ke, Anirudh Goyal Alias Parth Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio
We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.
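The mechanism can be pictured as sparse attention into a buffer of past hidden states, where the additive retrieval path gives gradients a direct route to the associated time steps. The sketch below is a drastic simplification under that reading, with module and parameter names of its own; it is not the paper's architecture.

```python
import torch
import torch.nn as nn

class SparseBacktrack(nn.Module):
    # Attend from the current state to the k best-matching past states and
    # mix them back in; the additive skip path lets gradient flow directly
    # to those past time steps.
    def __init__(self, d, k_top=3):
        super().__init__()
        self.query = nn.Linear(d, d)
        self.k_top = k_top

    def forward(self, h_t, memory):  # h_t: (B, d); memory: (T, B, d)
        scores = (memory * self.query(h_t)).sum(-1)          # (T, B) matches
        k = min(self.k_top, memory.shape[0])
        top = scores.topk(k, dim=0)                          # sparse selection
        w = torch.softmax(top.values, dim=0).unsqueeze(-1)   # (k, B, 1)
        idx = top.indices.unsqueeze(-1).expand(-1, -1, memory.shape[-1])
        return h_t + (w * memory.gather(0, idx)).sum(0)      # skip connection
```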
1 code implementation • ICLR 2019 • Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio
This problem becomes more evident in tasks where the information needed to correctly solve them exists over long time scales, because the exploding/vanishing gradients problem (EVGP) prevents important gradient components from being back-propagated adequately over a large number of steps.
no code implementations • 11 Sep 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Michael C. Mozer, Chris Pal, Yoshua Bengio
We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state.
no code implementations • ICML 2018 • Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal
We evaluate this method on several types of tasks with different attributes.
Ranked #3 on Open-Domain Question Answering on SearchQA (Unigram Acc metric)
no code implementations • 20 Jan 2018 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeswar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio
We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition.
1 code implementation • 24 Nov 2017 • Peter Henderson, Koustuv Sinha, Nicolas Angelard-Gontier, Nan Rosemary Ke, Genevieve Fried, Ryan Lowe, Joelle Pineau
The use of dialogue systems as a medium for human-machine interaction is an increasingly prevalent paradigm.
1 code implementation • NeurIPS 2017 • Anirudh Goyal, Alessandro Sordoni, Marc-Alexandre Côté, Nan Rosemary Ke, Yoshua Bengio
Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech.
no code implementations • 13 Nov 2017 • Anirudh Goyal, Nan Rosemary Ke, Alex Lamb, R. Devon Hjelm, Chris Pal, Joelle Pineau, Yoshua Bengio
This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function.
1 code implementation • NeurIPS 2017 • Anirudh Goyal, Nan Rosemary Ke, Surya Ganguli, Yoshua Bengio
The energy function is then modified so the model and data distributions match, with no guarantee on the number of steps required for the Markov chain to converge.
no code implementations • ICLR 2018 • Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Laurent Charlin, Chris Pal, Yoshua Bengio
A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, which comes from having to propagate credit information backwards through every single step of the forward computation.
no code implementations • 7 Sep 2017 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio
By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble.
2 code implementations • ICLR 2018 • Dmitriy Serdyuk, Nan Rosemary Ke, Alessandro Sordoni, Adam Trischler, Chris Pal, Yoshua Bengio
We propose a simple technique for encouraging generative RNNs to plan ahead.
6 code implementations • 3 Jun 2016 • David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, Chris Pal
We propose zoneout, a novel method for regularizing RNNs.
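The core update is only a few lines. A minimal sketch of per-unit zoneout as described in the cited paper (the function signature is an assumption of this sketch): during training each unit keeps its previous value with probability $p$, and at test time the two values are interpolated in expectation.

```python
import torch

def zoneout(h_prev, h_new, p=0.15, training=True):
    # Training: each unit keeps its previous value with probability p
    # (an identity "drop"), otherwise takes the new value.
    # Test time: deterministic expectation of the stochastic update.
    if training:
        keep = torch.bernoulli(torch.full_like(h_new, p))
        return keep * h_prev + (1 - keep) * h_new
    return p * h_prev + (1 - p) * h_new
```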
1 code implementation • 17 Mar 2016 • Shi Zong, Hao Ni, Kenny Sung, Nan Rosemary Ke, Zheng Wen, Branislav Kveton
In this work, we study cascading bandits, an online learning variant of the cascade model where the goal is to recommend $K$ most attractive items from a large set of $L$ candidate items.
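Under the cascade model, the learner observes only which (if any) of the $K$ recommended items was clicked, since the user scans top-down and stops at the first attractive item. A minimal sketch of that feedback model follows; it is illustrative and is not the paper's learning algorithm, and the names here are this sketch's own.

```python
import numpy as np

def cascade_feedback(attraction, ranked_items, rng):
    # User examines ranked_items in order and clicks the first attractive one;
    # items below the click are never examined (partial feedback).
    for pos, item in enumerate(ranked_items):
        if rng.random() < attraction[item]:
            return pos          # click position; examination stops here
    return None                 # no click among the K shown items

rng = np.random.default_rng(0)
attraction = np.array([0.2, 0.05, 0.4, 0.1])   # unknown to the learner
print(cascade_feedback(attraction, [2, 0, 3], rng))
```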
1 code implementation • 19 Nov 2015 • Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio
Our idea is that this score can be interpreted as an estimate of the task loss, and that the estimation error may be used as a consistent surrogate loss.
no code implementations • 7 Apr 2015 • William Chan, Nan Rosemary Ke, Ian Lane
The small DNN trained on the soft RNN alignments achieved a 3.93 WER on the Wall Street Journal (WSJ) eval92 task compared to a baseline 4.54 WER, or more than 13% relative improvement.
Automatic Speech Recognition (ASR)
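The relative-improvement figure above follows directly from the two reported WERs; a quick arithmetic check (not from the paper's code):

```python
baseline, small_dnn = 4.54, 3.93                   # WER on WSJ eval92
print(f"{(baseline - small_dnn) / baseline:.1%}")  # 13.4% -> "more than 13%"
```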