1 code implementation • ACL 2022 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Peng Li, Jie Zhou, Aaron Courville
We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task.
1 code implementation • 17 Jul 2023 • Tim Cooijmans, Milad Aghajohari, Aaron Courville
Gradient-based learning in multi-agent systems is difficult because the gradient derives from a first-order model which does not account for the interaction between agents' learning processes.
1 code implementation • 30 May 2023 • Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark.
no code implementations • 26 May 2023 • Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan
Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact algorithms, making them a tempting domain to apply machine learning methods.
no code implementations • 11 Feb 2023 • Dinghuai Zhang, Ling Pan, Ricky T. Q. Chen, Aaron Courville, Yoshua Bengio
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating complex combinatorial structure through a series of decision-making steps.
1 code implementation • 1 Feb 2023 • Taoli Cheng, Aaron Courville
As a classical generative modeling approach, energy-based models have the natural advantage of flexibility in the form of the energy function.
no code implementations • 15 Nov 2022 • Hattie Zhou, Azade Nova, Hugo Larochelle, Aaron Courville, Behnam Neyshabur, Hanie Sedghi
Large language models (LLMs) have shown increasing in-context learning capabilities through scaling up model and data size.
no code implementations • 15 Nov 2022 • Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni, Aaron Courville
In this work, we look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
no code implementations • 7 Oct 2022 • Ling Pan, Dinghuai Zhang, Aaron Courville, Longbo Huang, Yoshua Bengio
We specify intermediate rewards by intrinsic motivation to tackle the exploration problem in sparse reward environments.
1 code implementation • 3 Oct 2022 • Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen
While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.
no code implementations • 24 Sep 2022 • Sai Rajeswar, Pietro Mazzaglia, Tim Verbelen, Alexandre Piché, Bart Dhoedt, Aaron Courville, Alexandre Lacoste
In this work, we study the URLB and propose a new method to solve it, using unsupervised model-based RL for pre-training the agent and a task-aware fine-tuning strategy, combined with a newly proposed hybrid planner, Dyna-MPC, to adapt the agent for downstream tasks.
no code implementations • 16 Aug 2022 • Chin-wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville
In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation.
no code implementations • 30 Jun 2022 • Kyle Kastner, Aaron Courville
This paper introduces R-MelNet, a two-part autoregressive architecture with a frontend based on the first tier of MelNet and a backend WaveRNN-style audio decoder for neural text-to-speech synthesis.
1 code implementation • 7 Jun 2022 • Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala
Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks.
1 code implementation • 3 Jun 2022 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
To address these issues, we present reincarnating RL as an alternative workflow or class of problem settings, where prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another.
no code implementations • 2 Jun 2022 • Yuchen Lu, Zhen Liu, Aristide Baratin, Romain Laroche, Aaron Courville, Alessandro Sordoni
We propose to use the Intrinsic Dimension (ID) to assess expressiveness and introduce Cluster Learnability (CL) to assess learnability.
no code implementations • 1st International Workshop on Practical Deep Learning in the Wild, Association for the Advancement of Artificial Intelligence (AAAI) 2022 • Lluis Castrejon, Nicolas Ballas, Aaron Courville
Videos can be created by first outlining a global view of the scene and then adding local details.
1 code implementation • 16 May 2022 • Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville
This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later.
Ranked #4 on Atari Games 100k on Atari 100k
1 code implementation • 1 Apr 2022 • Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville
Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation.
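The projection described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names and the `temperature` parameter are assumptions introduced here, and the encoder that produces `z` is left out.

```python
import math

def softmax(xs, temperature=1.0):
    # Subtract the maximum before exponentiating for numerical stability.
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def simplicial_embedding(z, num_simplices, simplex_dim, temperature=1.0):
    """Split z into num_simplices groups of simplex_dim values and
    apply a softmax to each group independently (hypothetical helper)."""
    if len(z) != num_simplices * simplex_dim:
        raise ValueError("len(z) must equal num_simplices * simplex_dim")
    out = []
    for i in range(num_simplices):
        group = z[i * simplex_dim:(i + 1) * simplex_dim]
        out.extend(softmax(group, temperature))
    return out

# L = 2 simplices of V = 3 dimensions each.
sem = simplicial_embedding([0.5, -1.0, 2.0, 0.0, 0.0, 0.0],
                           num_simplices=2, simplex_dim=3)
```

Each group of `simplex_dim` outputs is non-negative and sums to one, so the full vector lies in a product of $L$ simplices.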
2 code implementations • 3 Feb 2022 • Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio
We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data.
1 code implementation • ICLR 2022 • Hattie Zhou, Ankit Vani, Hugo Larochelle, Aaron Courville
Forgetting is often seen as an unwanted characteristic in both human and machine learning.
no code implementations • 18 Jan 2022 • Taoli Cheng, Aaron Courville
We leverage representation learning and the inductive bias in neural-net-based Standard Model jet classification tasks, to detect non-QCD signal jets.
1 code implementation • ICLR 2022 • Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel
Musical expression requires control of both what notes are played, and how they are performed.
no code implementations • ICLR 2022 • Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine
In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.
no code implementations • CVPR 2022 • Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville
We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision.
Ranked #6 on Image Classification on WebVision-1000
1 code implementation • ICLR 2022 • Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio
We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.
no code implementations • ICLR 2022 • Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville
Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.
no code implementations • 29 Sep 2021 • Siyuan Zhou, Yikang Shen, Yuchen Lu, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan
With the isolation of information and the synchronous calling mechanism, we can impose a division of work between the controller and options in an end-to-end training regime.
no code implementations • ICLR 2022 • Shawn Tan, Chin-wei Huang, Alessandro Sordoni, Aaron Courville
Additionally, since the support of the marginal $q(z)$ is bounded and the support of the prior $p(z)$ is not, we propose renormalising the prior distribution over the support of $q(z)$.
no code implementations • 29 Sep 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville
Each object representation defines a localized neural radiance field that is used to generate 2D views of the scene through a differentiable rendering process.
Ranked #6 on Video Object Tracking on CATER
no code implementations • 29 Sep 2021 • Sai Rajeswar Mudumba, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville
This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data.
no code implementations • 29 Sep 2021 • Yuchen Lu, Zhen Liu, Alessandro Sordoni, Aristide Baratin, Romain Laroche, Aaron Courville
In this work, we argue that representations induced by self-supervised learning (SSL) methods should both be expressive and learnable.
no code implementations • 22 Sep 2021 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare
Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).
1 code implementation • NeurIPS 2021 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.
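One generic way to attach uncertainty to such a point estimate is a percentile bootstrap over runs. The sketch below is only an illustration of that idea under stated assumptions: the scores are made-up placeholder values, the function name is hypothetical, and the abstract does not say that this exact procedure is what the paper recommends.

```python
import random

def bootstrap_ci(scores, num_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample runs with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(scores, k=n)) / n for _ in range(num_resamples)
    )
    lower = means[int((alpha / 2) * num_resamples)]
    upper = means[int((1 - alpha / 2) * num_resamples) - 1]
    return lower, upper

# Hypothetical normalized scores from five training runs.
run_scores = [0.42, 0.55, 0.38, 0.61, 0.47]
low, high = bootstrap_ci(run_scores)
point_estimate = sum(run_scores) / len(run_scores)
```

Reporting `(low, high)` next to `point_estimate` makes it visible how much of a difference between two agents could be explained by the small number of runs.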
1 code implementation • NeurIPS 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville
Data efficiency is a key challenge for deep reinforcement learning.
Ranked #2 on Atari Games 100k on Atari 100k (using extra training data)
no code implementations • 5 Jun 2021 • Dinghuai Zhang, Kartik Ahuja, Yilun Xu, Yisen Wang, Aaron Courville
Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization?
1 code implementation • NeurIPS 2021 • Chin-wei Huang, Jae Hyun Lim, Aaron Courville
Under this framework, we show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE proposed by Song et al. (2021), bridging the theoretical gap.
no code implementations • 4 Jun 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville
Inspired by this we propose a hierarchical model for video generation which follows a coarse to fine approach.
1 code implementation • NAACL 2021 • Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R Devon Hjelm, Alessandro Sordoni, Aaron Courville
To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus.
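For readers unfamiliar with unlikelihood terms, a common token-level form penalises probability mass placed on a token that should not be generated. The sketch below assumes that form; the function names and the `eps` clamp are illustrative choices, and the listing does not specify the exact objective used in the paper.

```python
import math

def likelihood_loss(p_token):
    # Standard negative log-likelihood: small when p_token is high.
    return -math.log(p_token)

def unlikelihood_loss(p_token, eps=1e-12):
    # Penalise probability assigned to an unwanted token:
    # small when p_token is low, large when p_token approaches 1.
    return -math.log(max(1.0 - p_token, eps))

# Hypothetical probabilities the model assigns to an unwanted token.
loss_confident = unlikelihood_loss(0.9)
loss_unsure = unlikelihood_loss(0.1)
```

The two terms pull in opposite directions, so a combined objective would apply `likelihood_loss` to desired continuations and `unlikelihood_loss` to the negated ones.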
no code implementations • ICLR 2021 • Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville
Although neural module networks have an architectural bias towards compositionality, they require gold standard layouts to generalize systematically in practice.
1 code implementation • 1 Apr 2021 • Sai Rajeswar, Cyril Ibrahim, Nitin Surya, Florian Golemo, David Vazquez, Aaron Courville, Pedro O. Pinheiro
Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary in tasks that involve contact-rich motion.
no code implementations • 19 Mar 2021 • Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan
The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration.
no code implementations • ICLR Workshop SSL-RL 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, R Devon Hjelm, Philip Bachman, Aaron Courville
Data efficiency poses a major challenge for deep reinforcement learning.
2 code implementations • 4 Mar 2021 • Hadi Nekoei, Akilesh Badrinaaraayanan, Aaron Courville, Sarath Chandar
Its large strategy space makes it a desirable environment for lifelong RL tasks.
1 code implementation • 25 Jan 2021 • Michael Noukhovitch, Travis LaCroix, Angeliki Lazaridou, Aaron Courville
First, we show that communication is proportional to cooperation, and it can occur for partially competitive scenarios using standard learning algorithms.
no code implementations • ICLR 2021 • Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu
We consider the fundamental problem of how to automatically construct summary statistics for likelihood-free inference where the evaluation of the likelihood function is intractable but sampling / simulating data from the model is possible.
no code implementations • 1 Jan 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville
Current state-of-the-art generative models for videos have high computational requirements that impede high resolution generations beyond a few frames.
no code implementations • ICLR 2021 • Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville
We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.
no code implementations • ICLR 2021 • Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan
Many complex real-world tasks are composed of several levels of sub-tasks.
2 code implementations • ICLR 2021 • Chin-wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville
Flow-based models are powerful tools for designing probabilistic models with tractable density.
2 code implementations • ACL 2021 • Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville
There are two major classes of natural language grammar -- the dependency grammar that models one-to-one correspondences between words and the constituency grammar that models the assembly of one or several corresponding words.
no code implementations • Approximate Inference AABI Symposium 2021 • Jae Hyun Lim, Chin-wei Huang, Aaron Courville, Christopher Pal
In this work, we propose Bijective-Contrastive Estimation (BCE), a classification-based learning criterion for energy-based models.
2 code implementations • NeurIPS 2021 • Mohammad Pezeshki, Sékou-Oumar Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie
We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
no code implementations • NeurIPS 2020 • Pedro O. Pinheiro, Amjad Almahairi, Ryan Y. Benmalek, Florian Golemo, Aaron Courville
VADeR provides a natural representation for dense prediction tasks and transfers well to downstream tasks.
no code implementations • 22 Oct 2020 • Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville
In this paper, we propose NU-GAN, a new method for resampling audio from lower to higher sampling rates (upsampling).
no code implementations • NAACL 2021 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Siva Reddy, Aaron Courville
In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).
1 code implementation • 20 Oct 2020 • Yanzhi Chen, Dinghuai Zhang, Michael Gutmann, Aaron Courville, Zhanxing Zhu
We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shawn Tan, Yikang Shen, Timothy J. O'Donnell, Alessandro Sordoni, Aaron Courville
We model the recursive production property of context-free grammars for natural and synthetic languages.
no code implementations • EMNLP 2020 • Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
Language drift has been one of the major obstacles to training language models through interaction.
1 code implementation • ICLR 2021 • Samuel Lavoie, Faruk Ahmed, Aaron Courville
While unsupervised domain translation (UDT) has seen a lot of success recently, we argue that mediating its translation via categorical semantic features could broaden its applicability.
1 code implementation • ICLR 2021 • Max Schwarzer, Ankesh Anand, Rishab Goel, R. Devon Hjelm, Aaron Courville, Philip Bachman
We further improve performance by adding data augmentation to the future prediction loss, which forces the agent's representations to be consistent across multiple views of an observation.
Ranked #5 on Atari Games 100k on Atari 100k
1 code implementation • ICCV 2021 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
However, test images might contain zero- and few-shot compositions of objects and relationships, e.g., <cup, on, surfboard>.
2 code implementations • ICML 2020 • Jae Hyun Lim, Aaron Courville, Christopher Pal, Chin-wei Huang
Entropy is ubiquitous in machine learning, but it is in general intractable to compute the entropy of the distribution of an arbitrary continuous random variable.
1 code implementation • 17 May 2020 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky
We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA.
no code implementations • 6 May 2020 • Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Dung D. Vu, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio
We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS).
no code implementations • ICML 2020 • Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion.
1 code implementation • 23 Mar 2020 • Sai Rajeswar, Fahim Mannan, Florian Golemo, Jérôme Parent-Lévesque, David Vazquez, Derek Nowrouzezahrai, Aaron Courville
We propose Pix2Shape, an approach to solve this problem with four components: (i) an encoder that infers the latent 3D representation from an image, (ii) a decoder that generates an explicit 2.5D surfel-based reconstruction of a scene from the latent code, (iii) a differentiable renderer that synthesizes a 2D image from the surfel representation, and (iv) a critic network trained to discriminate between images generated by the decoder-renderer and those from a training distribution.
4 code implementations • 2 Mar 2020 • David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, Aaron Courville
Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world.
no code implementations • ICLR Workshop DeepDiffEq 2019 • Chin-wei Huang, Laurent Dinh, Aaron Courville
Normalizing flows are powerful invertible probabilistic models that can be used to translate two probability distributions, in a way that allows us to efficiently track the change of probability density.
1 code implementation • 17 Feb 2020 • Chin-wei Huang, Laurent Dinh, Aaron Courville
In this work, we propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.
Ranked #6 on Image Generation on CelebA 256x256
no code implementations • ICLR 2020 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare
Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).
3 code implementations • 12 Dec 2019 • Dzmitry Bahdanau, Harm de Vries, Timothy J. O'Donnell, Shikhar Murty, Philippe Beaudoin, Yoshua Bengio, Aaron Courville
In this work, we study how systematic the generalization of such models is, that is, to what extent they are capable of handling novel combinations of known linguistic constructs.
2 code implementations • 13 Nov 2019 • Sara Hooker, Aaron Courville, Gregory Clark, Yann Dauphin, Andrea Frome
However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques.
1 code implementation • NeurIPS 2019 • Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville
Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory.
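The "cumulative probability" of an attention distribution can be read as a running sum of softmax weights over memory slots, which yields a monotone gate between 0 and 1. The sketch below shows only that computation. The `soft_write` blending rule is a hypothetical illustration of how such a gate could mix old and new slot contents; it is not taken from the paper.

```python
import math
from itertools import accumulate

def cumulative_softmax(logits):
    """Running sum of softmax probabilities (monotone, ends at 1)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return list(accumulate(e / total for e in exps))

def soft_write(old_slots, new_slots, gate):
    # Hypothetical blend: slots where the gate is near 1 are overwritten,
    # slots where it is near 0 keep their previous content.
    return [g * new + (1.0 - g) * old
            for old, new, g in zip(old_slots, new_slots, gate)]

gate = cumulative_softmax([0.0, 2.0, 0.0, -1.0])
memory = soft_write([1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0], gate)
```

Because the gate is non-decreasing across slots, every slot after the attended position is affected at least as strongly as the slots before it.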
1 code implementation • 21 Oct 2019 • Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen
We release the largest public ECG dataset of continuous raw signals for representation learning containing 11 thousand patients and 2 billion labelled beats.
21 code implementations • NeurIPS 2019 • Kundan Kumar, Rithesh Kumar, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brebisson, Yoshua Bengio, Aaron Courville
In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques.
no code implementations • 25 Sep 2019 • Sara Hooker, Yann Dauphin, Aaron Courville, Andrea Frome
Neural network pruning techniques have demonstrated it is possible to remove the majority of weights in a network with surprisingly little degradation to top-1 test set accuracy.
no code implementations • 25 Sep 2019 • Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen
We release the largest public ECG dataset of continuous raw signals for representation learning containing over 11k patients and 2 billion labelled beats.
no code implementations • 25 Sep 2019 • Michael Noukhovitch, Travis LaCroix, Aaron Courville
Current literature in machine learning holds that unaligned, self-interested agents do not learn to use an emergent communication channel.
1 code implementation • 4 Sep 2019 • Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, Aaron Courville
Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal.
1 code implementation • 14 Aug 2019 • Cătălina Cangea, Eugene Belilovsky, Pietro Liò, Aaron Courville
The goal of this dataset is to assess question-answering performance from nearly-ideal navigation paths, while considering a much more complete variety of questions than current instantiations of the EQA task.
1 code implementation • 13 Aug 2019 • Faruk Ahmed, Aaron Courville
We critically appraise the recent interest in out-of-distribution (OOD) detection and question the practical relevance of existing benchmarks.
no code implementations • 6 Aug 2019 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare
This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).
1 code implementation • 24 Jun 2019 • Jacob Leygonie, Jennifer She, Amjad Almahairi, Sai Rajeswar, Aaron Courville
We show that during training, our generator follows the $W_2$-geodesic between the initial and the target distributions.
no code implementations • 23 Jun 2019 • Shawn Tan, Yikang Shen, Chin-wei Huang, Aaron Courville
The ability to understand logical relationships between sentences is an important task in language understanding.
no code implementations • 10 Jun 2019 • Chin-wei Huang, Ahmed Touati, Pascal Vincent, Gintare Karolina Dziugaite, Alexandre Lacoste, Aaron Courville
Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to scale to the high-dimensional setting of stochastic neural networks.
1 code implementation • 9 Jun 2019 • Chin-wei Huang, Aaron Courville
In this note, we study the relationship between the variational gap and the variance of the (log) likelihood ratio.
no code implementations • 29 May 2019 • Mikołaj Bińkowski, R. Devon Hjelm, Aaron Courville
We also provide a rigorous probabilistic setting for domain transfer and a new simplified objective for training transfer networks, an alternative to the complex, multi-component loss functions used in current state-of-the-art image-to-image translation models.
1 code implementation • 13 May 2019 • Chin-wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville
We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation.
no code implementations • ICLR 2019 • Samuel Lavoie-Marchildon, Sebastien Lachapelle, Mikołaj Bińkowski, Aaron Courville, Yoshua Bengio, R. Devon Hjelm
We perform completely unsupervised one-sided image-to-image translation between a source domain $X$ and a target domain $Y$ such that we preserve relevant underlying shared semantics (e.g., class, size, shape, etc.).
no code implementations • ICLR 2019 • Rithesh Kumar, Anirudh Goyal, Aaron Courville, Yoshua Bengio
Unsupervised learning is about capturing dependencies between variables and is driven by the contrast between the probable vs improbable configurations of these variables, often either via a generative model which only samples probable ones or with an energy function (unnormalized log-density) which is low for probable ones and high for improbable ones.
1 code implementation • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio
Because the hidden states are learned, this has the important effect of encouraging the hidden states for a class to be concentrated in such a way that interpolations within the same class or between two different classes do not intersect with the real data points from other classes.
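The interpolations mentioned here are convex combinations of two examples' hidden states, paired with the same combination of their labels. A minimal sketch follows, with toy vectors and a fixed mixing coefficient. In practice the coefficient is typically sampled (for instance with `random.betavariate`), which is an assumption about usage rather than something stated in this listing.

```python
def mix(a, b, lam):
    """Element-wise convex combination lam * a + (1 - lam) * b."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]

# Toy hidden states and one-hot labels for two examples.
h1, y1 = [1.0, 0.0, 2.0], [1.0, 0.0]
h2, y2 = [0.0, 4.0, 2.0], [0.0, 1.0]

lam = 0.25  # fixed here for reproducibility
h_mix = mix(h1, h2, lam)  # [0.25, 3.0, 2.0]
y_mix = mix(y1, y2, lam)  # [0.25, 0.75]
```

The rest of the network would then be trained to predict `y_mix` from `h_mix`, which is what pressures same-class hidden states to cluster.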
no code implementations • ICLR 2019 • Sai Rajeswar, Fahim Mannan, Florian Golemo, David Vazquez, Derek Nowrouzezahrai, Aaron Courville
Modelling 3D scenes from 2D images is a long-standing problem in computer vision with implications in, e.g., simulation and robotics.
1 code implementation • ICCV 2019 • Lluis Castrejon, Nicolas Ballas, Aaron Courville
To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models.
Ranked #2 on Video Prediction on Cityscapes 128x128
4 code implementations • 18 Mar 2019 • Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck
Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end.
Ranked #4 on Music Modeling on JSB Chorales
2 code implementations • 24 Jan 2019 • Rithesh Kumar, Sherjil Ozair, Anirudh Goyal, Aaron Courville, Yoshua Bengio
Maximum likelihood estimation of energy-based models is a challenging problem due to the intractability of the log-likelihood gradient.
1 code implementation • 4 Dec 2018 • Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau
In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map.
2 code implementations • ICLR 2019 • Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville
Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.
1 code implementation • 25 Nov 2018 • Johanna Hansen, Kyle Kastner, Aaron Courville, Gregory Dudek
We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b) for forward planning with MCTS.
1 code implementation • 18 Nov 2018 • Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville
We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al. (2017).
no code implementations • 17 Nov 2018 • Kyle Kastner, João Felipe Santos, Yoshua Bengio, Aaron Courville
Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation.
1 code implementation • 12 Nov 2018 • Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville
We explore blindfold (question-only) baselines for Embodied Question Answering.
7 code implementations • ICLR 2019 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Aaron Courville
When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed.
no code implementations • 27 Sep 2018 • Jacob Leygonie*, Jennifer She*, Amjad Almahairi, Sai Rajeswar, Aaron Courville
In this work we address the converse question: is it possible to recover an optimal map in a GAN fashion?
no code implementations • 27 Sep 2018 • Remi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio
While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.
no code implementations • 27 Sep 2018 • Chin-wei Huang, Faruk Ahmed, Kundan Kumar, Alexandre Lacoste, Aaron Courville
Probability distillation has recently been of interest to deep learning practitioners as it presents a practical solution for sampling from autoregressive models for deployment in real-time applications.
no code implementations • 18 Sep 2018 • Remi Tachet, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio
While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.
1 code implementation • NeurIPS 2018 • Chin-wei Huang, Shawn Tan, Alexandre Lacoste, Aaron Courville
Despite the advances in the representational capacity of approximate distributions for variational inference, the optimization process can still limit the density that is ultimately learned.
no code implementations • 29 Aug 2018 • Adrien Ali Taïga, Aaron Courville, Marc G. Bellemare
Next, we show how a given density model can be related to an abstraction and that the corresponding pseudo-count bonus can act as a substitute in MBIE-EB combined with this abstraction, but may lead to either under- or over-exploration.
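As background for the pseudo-count bonus mentioned here, the sketch below uses the standard pseudo-count construction from Bellemare et al. (2016) and a count-based bonus of the form $\beta / \sqrt{N}$. The function names and the value of `beta` are assumptions, and the abstraction-related analysis in this paper is not reproduced.

```python
import math

def pseudo_count(rho, rho_next):
    """Pseudo-count from a density model.

    rho      : probability of the state before observing it once more
    rho_next : probability of the same state after that observation
    """
    if not 0.0 < rho < rho_next < 1.0:
        raise ValueError("requires 0 < rho < rho_next < 1")
    return rho * (1.0 - rho_next) / (rho_next - rho)

def exploration_bonus(count, beta=0.05):
    # Count-based bonus that shrinks as a state is visited more often.
    return beta / math.sqrt(count)

# Sanity check with an empirical-frequency model: a state seen 3 times
# in 10 steps has rho = 3/10, and rho_next = 4/11 after one more visit.
n_hat = pseudo_count(3 / 10, 4 / 11)
bonus = exploration_bonus(n_hat)
```

With an exact empirical-frequency model the pseudo-count recovers the true visit count; a learned density model that generalises across states can produce larger or smaller values.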
1 code implementation • ECCV 2018 • Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin
Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.
no code implementations • ICML 2018 • Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, Devon Hjelm
We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.
2 code implementations • ICLR 2019 • Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville
Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy.
no code implementations • 18 Jun 2018 • Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, Aaron Courville
However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.
12 code implementations • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.
Ranked #19 on Image Classification on OmniBenchmark
2 code implementations • ACL 2018 • Yikang Shen, Zhouhan Lin, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio
In this work, we propose a novel constituency parsing scheme.
5 code implementations • ICML 2018 • Chin-wei Huang, David Krueger, Alexandre Lacoste, Aaron Courville
Normalizing flows and autoregressive models have been successfully combined to produce state-of-the-art results in density estimation, via Masked Autoregressive Flows (MAF), and to accelerate state-of-the-art WaveNet-based speech synthesis to 20x faster than real-time, via Inverse Autoregressive Flows (IAF).
no code implementations • 7 Mar 2018 • Yikang Shen, Shawn Tan, Chin-wei Huang, Aaron Courville
Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP).
3 code implementations • ICML 2018 • Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville
Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data.
no code implementations • ICLR 2018 • Mohamed Ishmael Belghazi, Sai Rajeswar, Olivier Mastropietro, Negar Rostamzadeh, Jovana Mitrovic, Aaron Courville
We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model.
19 code implementations • 12 Jan 2018 • Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, R. Devon Hjelm
We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.
no code implementations • ICLR 2018 • Brady Neal, Alex Lamb, Sherjil Ozair, Devon Hjelm, Aaron Courville, Yoshua Bengio, Ioannis Mitliagkas
One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks.
no code implementations • NeurIPS 2017 • Alex Lamb, Devon Hjelm, Yaroslav Ganin, Joseph Paul Cohen, Aaron Courville, Yoshua Bengio
Directed latent variable models that formulate the joint distribution as $p(x, z) = p(z) p(x \mid z)$ have the advantage of fast and exact sampling.
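The sampling property mentioned in this abstract can be illustrated with a toy example. The sketch below is not taken from the paper: the tables `P_Z` and `P_X_GIVEN_Z` and all function names are hypothetical, chosen only to show ancestral sampling from a joint that factorizes as $p(x, z) = p(z) p(x \mid z)$. Sampling needs one draw from the prior and one from the conditional, with no iterative procedure.

```python
import random

# Hypothetical toy distributions (assumptions for illustration only).
P_Z = {0: 0.3, 1: 0.7}
P_X_GIVEN_Z = {
    0: {"a": 0.5, "b": 0.5, "c": 0.0},
    1: {"a": 0.1, "b": 0.2, "c": 0.7},
}


def sample_categorical(dist, rng):
    """Draw one outcome from a dict mapping outcome -> probability."""
    outcomes = list(dist.keys())
    weights = list(dist.values())
    return rng.choices(outcomes, weights=weights, k=1)[0]


def ancestral_sample(rng):
    """Sample z from the prior, then x from the conditional given z."""
    z = sample_categorical(P_Z, rng)
    x = sample_categorical(P_X_GIVEN_Z[z], rng)
    return x, z


def joint_prob(x, z):
    """Evaluate the factorized joint p(x, z) = p(z) * p(x | z)."""
    return P_Z[z] * P_X_GIVEN_Z[z][x]


rng = random.Random(0)  # fixed seed so the draws are reproducible
samples = [ancestral_sample(rng) for _ in range(5)]
```

Each call to `ancestral_sample` returns an exact draw from the joint, which is the sense in which sampling from such models is fast and exact.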
no code implementations • 29 Nov 2017 • Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville
We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context.
1 code implementation • ICLR 2018 • Yikang Shen, Zhouhan Lin, Chin-wei Huang, Aaron Courville
In this paper, we propose a novel neural language model, called the Parsing-Reading-Predict Networks (PRPN), that can simultaneously induce the syntactic structure from unannotated sentences and leverage the inferred structure to learn a better language model.
Ranked #12 on Constituency Grammar Induction on PTB Diagnostic ECG Database (Max F1 (WSJ) metric)
no code implementations • ICLR 2018 • David Krueger, Chin-wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste, Aaron Courville
We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks.
no code implementations • 6 Oct 2017 • Chin-wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville
In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior.
5 code implementations • 22 Sep 2017 • Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville
We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation.
Ranked #3 on Visual Question Answering (VQA) on CLEVR-Humans
Image Retrieval with Multi-Modal Query • Visual Question Answering (VQA)
no code implementations • 26 Jul 2017 • Yikang Shen, Shawn Tan, Christopher Pal, Aaron Courville
We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies.
2 code implementations • 10 Jul 2017 • Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville
Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.
3 code implementations • NeurIPS 2017 • Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville
It is commonly assumed that language refers to high-level visual concepts while leaving low-level visual processing unaffected.
2 code implementations • ICML 2017 • Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness.
no code implementations • WS 2017 • Sai Rajeswar, Sandeep Subramanian, Francis Dutil, Christopher Pal, Aaron Courville
Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation.
110 code implementations • NeurIPS 2017 • Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville
Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability.
Ranked #3 on Image Generation on CAT 256x256
2 code implementations • 15 Mar 2017 • Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin
End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.
1 code implementation • 6 Feb 2017 • Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville
In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove that this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum.
Ranked #17 on Conditional Image Generation on CIFAR-10 (Inception score metric)
1 code implementation • 10 Jan 2017 • Ying Zhang, Mohammad Pezeshki, Philemon Brakel, Saizheng Zhang, Cesar Laurent, Yoshua Bengio, Aaron Courville
Meanwhile, Connectionist Temporal Classification (CTC) with Recurrent Neural Networks (RNNs), which is proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings.
Automatic Speech Recognition • Automatic Speech Recognition (ASR)
4 code implementations • 22 Dec 2016 • Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, Yoshua Bengio
In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time.
no code implementations • 12 Dec 2016 • Mehdi Mirza, Aaron Courville, Yoshua Bengio
In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution.
2 code implementations • 2 Dec 2016 • David Vázquez, Jorge Bernal, F. Javier Sánchez, Gloria Fernández-Esparrach, Antonio M. López, Adriana Romero, Michal Drozdzal, Aaron Courville
Colorectal cancer (CRC) is the third leading cause of cancer death worldwide.
2 code implementations • EMNLP (ACL) 2017 • Iulian V. Serban, Alexander G. Ororbia II, Joelle Pineau, Aaron Courville
Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as variational autoencoders.
2 code implementations • CVPR 2017 • Tegan Maharaj, Nicolas Ballas, Anna Rohrbach, Aaron Courville, Christopher Pal
In addition to presenting statistics and a description of the dataset, we perform a detailed analysis of 5 different models' predictions, and compare these with human performance.
4 code implementations • CVPR 2017 • Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville
Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.
1 code implementation • 15 Nov 2016 • Ishaan Gulrajani, Kundan Kumar, Faruk Ahmed, Adrien Ali Taiga, Francesco Visin, David Vazquez, Aaron Courville
Natural image modeling is a landmark challenge of unsupervised learning.
1 code implementation • NeurIPS 2016 • Alex Lamb, Anirudh Goyal, Ying Zhang, Saizheng Zhang, Aaron Courville, Yoshua Bengio
We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps.
3 code implementations • 24 Jul 2016 • Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio
We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).
Ranked #8 on Machine Translation on IWSLT2015 English-German
no code implementations • 8 Jun 2016 • Amjad Almahairi, Kyunghyun Cho, Nizar Habash, Aaron Courville
Neural machine translation has become a major alternative to widely used phrase-based statistical machine translation.
6 code implementations • 3 Jun 2016 • David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, Chris Pal
We propose zoneout, a novel method for regularizing RNNs.
9 code implementations • 2 Jun 2016 • Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville
We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process.
4 code implementations • 2 Jun 2016 • Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bo-Wen Zhou, Yoshua Bengio, Aaron Courville
We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens.
Ranked #1 on Dialogue Generation on Ubuntu Dialogue (Activity)
9 code implementations • 19 May 2016 • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio
Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue.
no code implementations • 12 May 2016 • Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, Christopher Pal, Hugo Larochelle, Aaron Courville, Bernt Schiele
In addition, we collected and aligned the movie scripts used in prior work and compared the two sources of descriptions.
1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang
Since its introduction, it has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.
3 code implementations • 30 Mar 2016 • Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville
We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks.
Ranked #20 on Language Modelling on Text8
1 code implementation • ACL 2016 • Iulian Vlad Serban, Alberto García-Durán, Caglar Gulcehre, Sungjin Ahn, Sarath Chandar, Aaron Courville, Yoshua Bengio
Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances.
1 code implementation • 9 Feb 2016 • Alex Lamb, Vincent Dumoulin, Aaron Courville
We propose to take advantage of this by using the representations from discriminative classifiers to augment the objective function corresponding to a generative model.
2 code implementations • 24 Nov 2015 • Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville
The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks.
2 code implementations • 22 Nov 2015 • Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville
Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features.
Ranked #17 on Semantic Segmentation on CamVid
1 code implementation • 20 Nov 2015 • Guillaume Alain, Alex Lamb, Chinnadhurai Sankar, Aaron Courville, Yoshua Bengio
This leads the model to update using an unbiased estimate of the gradient which also has minimum variance when the sampling proposal is proportional to the L2-norm of the gradient.
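The claim in this excerpt can be checked on a small example. The sketch below is an illustration under stated assumptions, not the authors' implementation: `grads` is a hypothetical list of per-example gradients with nonzero norms, "variance" is taken to mean the trace of the estimator's covariance, and expectations are computed by exact enumeration rather than by sampling. An index i is drawn with probability q_i and its gradient is reweighted by 1/(N q_i); any proposal with q_i > 0 is then unbiased for the mean gradient, and choosing q_i proportional to the gradient's L2 norm gives a lower variance than uniform sampling on this toy data.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def norm_proportional_proposal(grads):
    """Proposal q_i proportional to ||g_i||_2 (assumes every norm is > 0)."""
    norms = [l2_norm(g) for g in grads]
    total = sum(norms)
    return [n / total for n in norms]


def reweighted(grad, q_i, n):
    """Importance-weighted single-example estimate g_i / (n * q_i)."""
    return [v / (n * q_i) for v in grad]


def expected_estimate(grads, proposal):
    """Exact expectation of the estimator under the given proposal."""
    n, dim = len(grads), len(grads[0])
    mean = [0.0] * dim
    for g, q in zip(grads, proposal):
        est = reweighted(g, q, n)
        for d in range(dim):
            mean[d] += q * est[d]
    return mean


def total_variance(grads, proposal):
    """Trace of the estimator's covariance: E||est||^2 - ||E[est]||^2."""
    n = len(grads)
    mean = expected_estimate(grads, proposal)
    second_moment = sum(
        q * l2_norm(reweighted(g, q, n)) ** 2 for g, q in zip(grads, proposal)
    )
    return second_moment - l2_norm(mean) ** 2


# Hypothetical per-example gradients (illustrative values only).
grads = [[3.0, 4.0], [0.3, 0.4], [-1.0, 0.0], [0.0, 2.0]]
n = len(grads)
true_mean = [sum(g[d] for g in grads) / n for d in range(2)]

uniform = [1.0 / n] * n
by_norm = norm_proportional_proposal(grads)
```

Comparing `total_variance(grads, by_norm)` with `total_variance(grads, uniform)` shows the reduction, while `expected_estimate` returns `true_mean` (up to floating-point error) for both proposals.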
2 code implementations • 19 Nov 2015 • Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville
We propose an approach to learn spatio-temporal features in videos from intermediate visual representations we call "percepts" using Gated-Recurrent-Unit Recurrent Networks (GRUs). Our method relies on percepts that are extracted from all levels of a deep convolutional network trained on the large ImageNet dataset.
no code implementations • 19 Nov 2015 • Marcin Moczulski, Kelvin Xu, Aaron Courville, Kyunghyun Cho
Recently there has been growing interest in building active visual object recognizers, as opposed to the usual passive recognizers, which classify a given static image into a predefined set of object categories.
1 code implementation • 19 Nov 2015 • Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio
Our idea is that this score can be interpreted as an estimate of the task loss, and that the estimation error may be used as a consistent surrogate loss.