
1 code implementation • ACL 2022 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Peng Li, Jie Zhou, Aaron Courville

We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task.

1 code implementation • 16 May 2022 • Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville

This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later.

Ranked #2 on Atari Games 100k on Atari 100k

1 code implementation • 1 Apr 2022 • Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Kenji Kawaguchi, Ankit Vani, Aaron Courville

Specifically, we show that the temperature $\tau$ of the Softmax operation controls for the SEM representation's expressivity, allowing us to derive a tighter downstream classifier generalization bound than that for classifiers using unnormalized representations.
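The temperature $\tau$ referred to above is the scaling factor inside the Softmax; a minimal NumPy sketch of its effect (an illustration, not the paper's SEM implementation):

```python
import numpy as np

def softmax_with_temperature(logits, tau=1.0):
    """Softmax with temperature tau: small tau sharpens the distribution
    toward a one-hot vector, large tau flattens it toward uniform."""
    z = np.asarray(logits, dtype=float) / tau
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Lower temperature concentrates mass on the largest logit.
p_sharp = softmax_with_temperature([2.0, 1.0, 0.5], tau=0.1)
p_flat = softmax_with_temperature([2.0, 1.0, 0.5], tau=10.0)
```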

no code implementations • 1st International Workshop on Practical Deep Learning in the Wild, Association for the Advancement of Artificial Intelligence (AAAI) 2022 • Lluis Castrejon, Nicolas Ballas, Aaron Courville

Inspired by this we propose a cascaded model for video generation which follows a coarse to fine approach.

1 code implementation • 3 Feb 2022 • Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio

We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data.

1 code implementation • ICLR 2022 • Hattie Zhou, Ankit Vani, Hugo Larochelle, Aaron Courville

Forgetting is often seen as an unwanted characteristic in both human and machine learning.

no code implementations • 18 Jan 2022 • Taoli Cheng, Aaron Courville

We leverage representation learning and the inductive bias in neural-net-based Standard Model jet classification tasks, to detect non-QCD signal jets.

1 code implementation • ICLR 2022 • Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel

Musical expression requires control of both what notes are played, and how they are performed.

no code implementations • ICLR 2022 • Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine

In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.

1 code implementation • 23 Nov 2021 • Sai Rajeswar, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

We also show that MILe is effective at reducing label noise, achieving state-of-the-art performance on real-world large-scale noisy data such as WebVision.

Ranked #3 on Image Classification on WebVision-1000

1 code implementation • ICLR 2022 • Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio

We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.

no code implementations • ICLR 2022 • Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville

Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.

no code implementations • ICLR 2022 • Shawn Tan, Chin-wei Huang, Alessandro Sordoni, Aaron Courville

Additionally, since the support of the marginal $q(z)$ is bounded and the support of the prior $p(z)$ is not, we propose renormalising the prior distribution over the support of $q(z)$.

no code implementations • 29 Sep 2021 • Yuchen Lu, Zhen Liu, Alessandro Sordoni, Aristide Baratin, Romain Laroche, Aaron Courville

In this work, we argue that representations induced by self-supervised learning (SSL) methods should both be expressive and learnable.

no code implementations • 29 Sep 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville

Each object representation defines a localized neural radiance field that is used to generate 2D views of the scene through a differentiable rendering process.

no code implementations • 29 Sep 2021 • Siyuan Zhou, Yikang Shen, Yuchen Lu, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

With the isolation of information and the synchronous calling mechanism, we can impose a division of works between the controller and options in an end-to-end training regime.

no code implementations • 29 Sep 2021 • Sai Rajeswar Mudumba, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data.

no code implementations • 22 Sep 2021 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

1 code implementation • NeurIPS 2021 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.
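One standard way to quantify the uncertainty implied by a finite number of training runs is a percentile bootstrap over per-run scores; a generic sketch (the helper name and defaults are illustrative, not the paper's tooling):

```python
import numpy as np

def bootstrap_ci(run_scores, stat=np.mean, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for an aggregate statistic
    (e.g. the mean score) computed from a finite set of training runs."""
    rng = np.random.default_rng(seed)
    run_scores = np.asarray(run_scores, dtype=float)
    boots = np.array([
        stat(rng.choice(run_scores, size=len(run_scores), replace=True))
        for _ in range(n_boot)
    ])
    return np.quantile(boots, [alpha / 2, 1 - alpha / 2])

# Ten hypothetical per-run normalized scores for one algorithm.
scores = [0.8, 1.1, 0.9, 1.3, 0.7, 1.0, 1.2, 0.95, 1.05, 0.85]
lo, hi = bootstrap_ci(scores)
```

Reporting the interval `[lo, hi]` rather than only the point estimate makes overlap between competing algorithms visible.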

1 code implementation • NeurIPS 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

Data efficiency is a key challenge for deep reinforcement learning.

1 code implementation • NeurIPS 2021 • Chin-wei Huang, Jae Hyun Lim, Aaron Courville

Under this framework, we show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE proposed by Song et al. (2021), bridging the theoretical gap.

no code implementations • 5 Jun 2021 • Dinghuai Zhang, Kartik Ahuja, Yilun Xu, Yisen Wang, Aaron Courville

Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization?

no code implementations • 4 Jun 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville

Inspired by this we propose a hierarchical model for video generation which follows a coarse to fine approach.

1 code implementation • NAACL 2021 • Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R Devon Hjelm, Alessandro Sordoni, Aaron Courville

To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus.

no code implementations • ICLR 2021 • Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville

Although neural module networks have an architectural bias towards compositionality, they require gold standard layouts to generalize systematically in practice.

1 code implementation • 1 Apr 2021 • Sai Rajeswar, Cyril Ibrahim, Nitin Surya, Florian Golemo, David Vazquez, Aaron Courville, Pedro O. Pinheiro

Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary in tasks that involve contact-rich motion.

no code implementations • 19 Mar 2021 • Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration.

no code implementations • ICLR Workshop SSL-RL 2021 • Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, R Devon Hjelm, Philip Bachman, Aaron Courville

Data efficiency poses a major challenge for deep reinforcement learning.

2 code implementations • 4 Mar 2021 • Hadi Nekoei, Akilesh Badrinaaraayanan, Aaron Courville, Sarath Chandar

Its large strategy space makes it a desirable environment for lifelong RL tasks.

1 code implementation • 25 Jan 2021 • Michael Noukhovitch, Travis LaCroix, Angeliki Lazaridou, Aaron Courville

First, we show that communication is proportional to cooperation, and it can occur for partially competitive scenarios using standard learning algorithms.

no code implementations • ICLR 2021 • Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville

We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.

no code implementations • 1 Jan 2021 • Lluis Castrejon, Nicolas Ballas, Aaron Courville

Current state-of-the-art generative models for videos have high computational requirements that impede high resolution generations beyond a few frames.

no code implementations • ICLR 2021 • Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

Many complex real-world tasks are composed of several levels of sub-tasks.

no code implementations • ICLR 2021 • Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for likelihood-free inference where the evaluation of likelihood function is intractable but sampling / simulating data from the model is possible.

1 code implementation • ICLR 2021 • Chin-wei Huang, Ricky T. Q. Chen, Christos Tsirigotis, Aaron Courville

Flow-based models are powerful tools for designing probabilistic models with tractable density.

1 code implementation • ACL 2021 • Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville

There are two major classes of natural language grammar -- the dependency grammar that models one-to-one correspondences between words and the constituency grammar that models the assembly of one or several corresponded words.

no code implementations • Advances in Approximate Bayesian Inference (AABI) Symposium 2021 • Jae Hyun Lim, Chin-wei Huang, Aaron Courville, Christopher Pal

In this work, we propose Bijective-Contrastive Estimation (BCE), a classification-based learning criterion for energy-based models.

2 code implementations • NeurIPS 2021 • Mohammad Pezeshki, Sékou-Oumar Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie

We identify and formalize a fundamental gradient descent phenomenon resulting in a learning proclivity in over-parameterized neural networks.

no code implementations • NeurIPS 2020 • Pedro O. Pinheiro, Amjad Almahairi, Ryan Y. Benmalek, Florian Golemo, Aaron Courville

VADeR provides a natural representation for dense prediction tasks and transfers well to downstream tasks.

no code implementations • 22 Oct 2020 • Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville

In this paper, we propose NU-GAN, a new method for resampling audio from lower to higher sampling rates (upsampling).

no code implementations • NAACL 2021 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Siva Reddy, Aaron Courville

In the present work, we propose a new syntax-aware language model: Syntactic Ordered Memory (SOM).

no code implementations • 20 Oct 2020 • Yanzhi Chen, Dinghuai Zhang, Michael Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible.

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Shawn Tan, Yikang Shen, Timothy J. O'Donnell, Alessandro Sordoni, Aaron Courville

We model the recursive production property of context-free grammars for natural and synthetic languages.

no code implementations • EMNLP 2020 • Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

Language drift has been one of the major obstacles to train language models through interaction.

1 code implementation • ICLR 2021 • Samuel Lavoie, Faruk Ahmed, Aaron Courville

While unsupervised domain translation (UDT) has seen a lot of success recently, we argue that mediating its translation via categorical semantic features could broaden its applicability.

1 code implementation • ICLR 2021 • Max Schwarzer, Ankesh Anand, Rishab Goel, R. Devon Hjelm, Aaron Courville, Philip Bachman

We further improve performance by adding data augmentation to the future prediction loss, which forces the agent's representations to be consistent across multiple views of an observation.

Ranked #3 on Atari Games 100k on Atari 100k

1 code implementation • ICCV 2021 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky

However, test images might contain zero- and few-shot compositions of objects and relationships, e.g. <cup, on, surfboard>.

2 code implementations • ICML 2020 • Jae Hyun Lim, Aaron Courville, Christopher Pal, Chin-wei Huang

Entropy is ubiquitous in machine learning, but it is in general intractable to compute the entropy of the distribution of an arbitrary continuous random variable.
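When the log-density is available, the baseline Monte Carlo estimator $H(p) = -\mathbb{E}_p[\log p(x)]$ is straightforward; the entry's point is that this breaks down when $\log p$ is unavailable. A sketch for the tractable Gaussian case (an illustration of the baseline, not the paper's method):

```python
import numpy as np

def mc_entropy(samples, log_density):
    """Monte Carlo estimate of differential entropy H(p) = -E_p[log p(x)]."""
    return -np.mean(log_density(np.asarray(samples)))

rng = np.random.default_rng(0)
sigma = 2.0
xs = rng.normal(0.0, sigma, size=200_000)
log_p = lambda x: -0.5 * (x / sigma) ** 2 - 0.5 * np.log(2 * np.pi * sigma ** 2)
estimate = mc_entropy(xs, log_p)
exact = 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)  # closed-form Gaussian entropy, in nats
```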

1 code implementation • 17 May 2020 • Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky

We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA.

no code implementations • 6 May 2020 • Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Dung D. Vu, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio

We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS).

no code implementations • ICML 2020 • Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion.

1 code implementation • 23 Mar 2020 • Sai Rajeswar, Fahim Mannan, Florian Golemo, Jérôme Parent-Lévesque, David Vazquez, Derek Nowrouzezahrai, Aaron Courville

We propose Pix2Shape, an approach to solve this problem with four components: (i) an encoder that infers the latent 3D representation from an image, (ii) a decoder that generates an explicit 2.5D surfel-based reconstruction of a scene from the latent code, (iii) a differentiable renderer that synthesizes a 2D image from the surfel representation, and (iv) a critic network trained to discriminate between images generated by the decoder-renderer and those from a training distribution.

3 code implementations • 2 Mar 2020 • David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, Aaron Courville

Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world.

no code implementations • ICLR Workshop DeepDiffEq 2019 • Chin-wei Huang, Laurent Dinh, Aaron Courville

Normalizing flows are powerful invertible probabilistic models that can be used to translate two probability distributions, in a way that allows us to efficiently track the change of probability density.

1 code implementation • 17 Feb 2020 • Chin-wei Huang, Laurent Dinh, Aaron Courville

In this work, we propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.

Ranked #6 on Image Generation on CelebA 256x256

no code implementations • ICLR 2020 • Adrien Ali Taiga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

3 code implementations • 12 Dec 2019 • Dzmitry Bahdanau, Harm de Vries, Timothy J. O'Donnell, Shikhar Murty, Philippe Beaudoin, Yoshua Bengio, Aaron Courville

In this work, we study how systematic the generalization of such models is, that is to which extent they are capable of handling novel combinations of known linguistic constructs.

2 code implementations • 13 Nov 2019 • Sara Hooker, Aaron Courville, Gregory Clark, Yann Dauphin, Andrea Frome

However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques.

1 code implementation • NeurIPS 2019 • Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville

Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory.
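The cumulative probability referred to above is the "cumax" operation from Ordered Neurons, the cumulative sum of a softmax; a minimal sketch (how it gates writing and erasing in the full model is not shown here):

```python
import numpy as np

def cumax(logits):
    """Cumulative softmax: a monotone non-decreasing gate in [0, 1].
    The point where the cumulative mass saturates softly marks a boundary."""
    e = np.exp(logits - np.max(logits))  # numerically stable softmax
    p = e / e.sum()
    return np.cumsum(p)

g = cumax(np.array([0.5, 2.0, -1.0, 0.0]))
```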

1 code implementation • 21 Oct 2019 • Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

We release the largest public ECG dataset of continuous raw signals for representation learning containing 11 thousand patients and 2 billion labelled beats.

21 code implementations • NeurIPS 2019 • Kundan Kumar, Rithesh Kumar, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brebisson, Yoshua Bengio, Aaron Courville

In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques.

no code implementations • 25 Sep 2019 • Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

We release the largest public ECG dataset of continuous raw signals for representation learning containing over 11k patients and 2 billion labelled beats.

no code implementations • 25 Sep 2019 • Sara Hooker, Yann Dauphin, Aaron Courville, Andrea Frome

Neural network pruning techniques have demonstrated it is possible to remove the majority of weights in a network with surprisingly little degradation to top-1 test set accuracy.

no code implementations • 25 Sep 2019 • Michael Noukhovitch, Travis LaCroix, Aaron Courville

Current literature in machine learning holds that unaligned, self-interested agents do not learn to use an emergent communication channel.

1 code implementation • 4 Sep 2019 • Philip Paquette, Yuchen Lu, Steven Bocco, Max O. Smith, Satya Ortiz-Gagne, Jonathan K. Kummerfeld, Satinder Singh, Joelle Pineau, Aaron Courville

Diplomacy is a seven-player non-stochastic, non-cooperative game, where agents acquire resources through a mix of teamwork and betrayal.

1 code implementation • 14 Aug 2019 • Cătălina Cangea, Eugene Belilovsky, Pietro Liò, Aaron Courville

The goal of this dataset is to assess question-answering performance from nearly-ideal navigation paths, while considering a much more complete variety of questions than current instantiations of the EQA task.

1 code implementation • 13 Aug 2019 • Faruk Ahmed, Aaron Courville

We critically appraise the recent interest in out-of-distribution (OOD) detection and question the practical relevance of existing benchmarks.

no code implementations • 6 Aug 2019 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).

1 code implementation • 24 Jun 2019 • Jacob Leygonie, Jennifer She, Amjad Almahairi, Sai Rajeswar, Aaron Courville

We show that during training, our generator follows the $W_2$-geodesic between the initial and the target distributions.

no code implementations • 23 Jun 2019 • Shawn Tan, Yikang Shen, Chin-wei Huang, Aaron Courville

The ability to understand logical relationships between sentences is an important task in language understanding.

no code implementations • 10 Jun 2019 • Chin-wei Huang, Ahmed Touati, Pascal Vincent, Gintare Karolina Dziugaite, Alexandre Lacoste, Aaron Courville

Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to scale to the high-dimensional setting of stochastic neural networks.

1 code implementation • 9 Jun 2019 • Chin-wei Huang, Aaron Courville

In this note, we study the relationship between the variational gap and the variance of the (log) likelihood ratio.

no code implementations • 29 May 2019 • Mikołaj Bińkowski, R. Devon Hjelm, Aaron Courville

We also provide a rigorous probabilistic setting for domain transfer and a new simplified objective for training transfer networks, an alternative to the complex, multi-component loss functions used in current state-of-the-art image-to-image translation models.

1 code implementation • 13 May 2019 • Chin-wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville

We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation.

no code implementations • ICLR 2019 • Sai Rajeswar, Fahim Mannan, Florian Golemo, David Vazquez, Derek Nowrouzezahrai, Aaron Courville

Modelling 3D scenes from 2D images is a long-standing problem in computer vision with implications in, e.g., simulation and robotics.

1 code implementation • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio

Because the hidden states are learned, this has an important effect of encouraging the hidden states for a class to be concentrated in such a way so that interpolations within the same class or between two different classes do not intersect with the real data points from other classes.
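The interpolation described above acts on pairs of hidden states and their labels; a minimal sketch of one such step (in the paper the coefficient is drawn from a Beta distribution and applied at a randomly chosen layer; the helper here is illustrative):

```python
import numpy as np

def mixup_hidden_states(h_a, h_b, y_a, y_b, lam):
    """Linearly interpolate a pair of hidden states and their one-hot
    labels with coefficient lam in [0, 1]."""
    h = lam * h_a + (1 - lam) * h_b
    y = lam * y_a + (1 - lam) * y_b
    return h, y

h_a, h_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
y_a, y_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
h, y = mixup_hidden_states(h_a, h_b, y_a, y_b, lam=0.3)
```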

no code implementations • ICLR 2019 • Samuel Lavoie-Marchildon, Sebastien Lachapelle, Mikołaj Bińkowski, Aaron Courville, Yoshua Bengio, R. Devon Hjelm

We perform completely unsupervised one-sided image to image translation between a source domain $X$ and a target domain $Y$ such that we preserve relevant underlying shared semantics (e.g., class, size, shape, etc.).

no code implementations • ICLR 2019 • Rithesh Kumar, Anirudh Goyal, Aaron Courville, Yoshua Bengio

Unsupervised learning is about capturing dependencies between variables and is driven by the contrast between the probable vs improbable configurations of these variables, often either via a generative model which only samples probable ones or with an energy function (unnormalized log-density) which is low for probable ones and high for improbable ones.

1 code implementation • ICCV 2019 • Lluis Castrejon, Nicolas Ballas, Aaron Courville

To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models.

Ranked #2 on Video Prediction on Cityscapes 128x128

3 code implementations • 18 Mar 2019 • Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck

Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end.

Ranked #4 on Music Modeling on JSB Chorales

2 code implementations • 24 Jan 2019 • Rithesh Kumar, Sherjil Ozair, Anirudh Goyal, Aaron Courville, Yoshua Bengio

Maximum likelihood estimation of energy-based models is a challenging problem due to the intractability of the log-likelihood gradient.

1 code implementation • 4 Dec 2018 • Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau

In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map.

2 code implementations • ICLR 2019 • Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.

1 code implementation • 25 Nov 2018 • Johanna Hansen, Kyle Kastner, Aaron Courville, Gregory Dudek

We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b) for forward planning with MCTS.

1 code implementation • 18 Nov 2018 • Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville

We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al. (2017).

no code implementations • 17 Nov 2018 • Kyle Kastner, João Felipe Santos, Yoshua Bengio, Aaron Courville

Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation.

1 code implementation • 12 Nov 2018 • Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville

We explore blindfold (question-only) baselines for Embodied Question Answering.

7 code implementations • ICLR 2019 • Yikang Shen, Shawn Tan, Alessandro Sordoni, Aaron Courville

When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed.

Ranked #8 on Constituency Grammar Induction on PTB

no code implementations • 27 Sep 2018 • Chin-wei Huang, Faruk Ahmed, Kundan Kumar, Alexandre Lacoste, Aaron Courville

Probability distillation has recently been of interest to deep learning practitioners as it presents a practical solution for sampling from autoregressive models for deployment in real-time applications.

no code implementations • 27 Sep 2018 • Remi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.

no code implementations • 27 Sep 2018 • Jacob Leygonie*, Jennifer She*, Amjad Almahairi, Sai Rajeswar, Aaron Courville

In this work we address the converse question: is it possible to recover an optimal map in a GAN fashion?

no code implementations • 18 Sep 2018 • Remi Tachet, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.

1 code implementation • NeurIPS 2018 • Chin-wei Huang, Shawn Tan, Alexandre Lacoste, Aaron Courville

Despite the advances in the representational capacity of approximate distributions for variational inference, the optimization process can still limit the density that is ultimately learned.

no code implementations • 29 Aug 2018 • Adrien Ali Taïga, Aaron Courville, Marc G. Bellemare

Next, we show how a given density model can be related to an abstraction and that the corresponding pseudo-count bonus can act as a substitute in MBIE-EB combined with this abstraction, but may lead to either under- or over-exploration.

1 code implementation • ECCV 2018 • Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

no code implementations • ICML 2018 • Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, Devon Hjelm

We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.

2 code implementations • ICLR 2019 • Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville

Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy.

no code implementations • 18 Jun 2018 • Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, Aaron Courville

However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.

11 code implementations • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio

Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.

Ranked #65 on Image Classification on CIFAR-10

2 code implementations • ACL 2018 • Yikang Shen, Zhouhan Lin, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio

In this work, we propose a novel constituency parsing scheme.

4 code implementations • ICML 2018 • Chin-wei Huang, David Krueger, Alexandre Lacoste, Aaron Courville

Normalizing flows and autoregressive models have been successfully combined to produce state-of-the-art results in density estimation, via Masked Autoregressive Flows (MAF), and to accelerate state-of-the-art WaveNet-based speech synthesis to 20x faster than real-time, via Inverse Autoregressive Flows (IAF).
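The affine autoregressive transform shared by MAF and IAF can be sketched sequentially as follows (real implementations use a masked network to compute all conditioners in parallel; the toy conditioners below are illustrative stand-ins):

```python
import numpy as np

def affine_autoregressive_forward(x, mu_fn, log_sigma_fn):
    """One affine autoregressive flow step: y_i = x_i * exp(s_i) + mu_i,
    where mu_i = mu_fn(x[:i]) and s_i = log_sigma_fn(x[:i]) depend only
    on x_{<i}. Returns y and the log-determinant of the Jacobian."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    log_det = 0.0
    for i in range(len(x)):
        mu, s = mu_fn(x[:i]), log_sigma_fn(x[:i])
        y[i] = x[i] * np.exp(s) + mu
        log_det += s  # triangular Jacobian: log|det| = sum of s_i
    return y, log_det

# Toy conditioners standing in for a masked neural network.
mu_fn = lambda prev: 0.5 * prev.sum()
log_sigma_fn = lambda prev: 0.1 * len(prev)
y, log_det = affine_autoregressive_forward([1.0, 2.0, 3.0], mu_fn, log_sigma_fn)
```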

no code implementations • 7 Mar 2018 • Yikang Shen, Shawn Tan, Chin-wei Huang, Aaron Courville

Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP).

3 code implementations • ICML 2018 • Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville

Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data.

no code implementations • ICLR 2018 • Mohamed Ishmael Belghazi, Sai Rajeswar, Olivier Mastropietro, Negar Rostamzadeh, Jovana Mitrovic, Aaron Courville

We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model.

18 code implementations • 12 Jan 2018 • Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, R. Devon Hjelm

We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.

no code implementations • ICLR 2018 • Brady Neal, Alex Lamb, Sherjil Ozair, Devon Hjelm, Aaron Courville, Yoshua Bengio, Ioannis Mitliagkas

One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks.

no code implementations • NeurIPS 2017 • Alex Lamb, Devon Hjelm, Yaroslav Ganin, Joseph Paul Cohen, Aaron Courville, Yoshua Bengio

Directed latent variable models that formulate the joint distribution as $p(x, z) = p(z) p(x \mid z)$ have the advantage of fast and exact sampling.
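The fast, exact sampling mentioned above is ancestral sampling through the factorization $p(x, z) = p(z)\,p(x \mid z)$; a minimal sketch with Gaussian stand-ins for the prior and likelihood:

```python
import numpy as np

def ancestral_sample(sample_prior, sample_conditional, n, rng):
    """Exact sampling from p(x, z) = p(z) p(x | z): first draw z ~ p(z),
    then draw x ~ p(x | z). One forward pass, no MCMC required."""
    z = sample_prior(n, rng)
    x = sample_conditional(z, rng)
    return x, z

rng = np.random.default_rng(1)
prior = lambda n, rng: rng.normal(0.0, 1.0, size=n)        # p(z) = N(0, 1)
conditional = lambda z, rng: rng.normal(2.0 * z, 0.1)      # p(x|z) = N(2z, 0.1^2)
x, z = ancestral_sample(prior, conditional, 100_000, rng)
```

The marginal of `x` here has mean 0 and variance $2^2 \cdot 1 + 0.1^2 = 4.01$, which the samples recover.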

no code implementations • 29 Nov 2017 • Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context.

1 code implementation • ICLR 2018 • Yikang Shen, Zhouhan Lin, Chin-wei Huang, Aaron Courville

In this paper, We propose a novel neural language model, called the Parsing-Reading-Predict Networks (PRPN), that can simultaneously induce the syntactic structure from unannotated sentences and leverage the inferred structure to learn a better language model.

Ranked #11 on Constituency Grammar Induction on PTB (Max F1 (WSJ) metric)

no code implementations • ICLR 2018 • David Krueger, Chin-wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste, Aaron Courville

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks.

no code implementations • 6 Oct 2017 • Chin-wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville

In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior.

4 code implementations • 22 Sep 2017 • Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville

We introduce a general-purpose conditioning method for neural networks called FiLM: Feature-wise Linear Modulation.

Ranked #3 on Visual Question Answering on CLEVR-Humans

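The mechanism is compact enough to sketch: a conditioning vector is mapped to a per-channel scale gamma and shift beta that modulate a feature map element-wise. Shapes and the linear generator below are illustrative, not the paper's architecture:

```python
import numpy as np

# Minimal FiLM sketch: FiLM(F) = gamma * F + beta, with gamma and beta
# produced from a conditioning input (e.g. a question embedding).
rng = np.random.default_rng(0)
features = rng.normal(size=(2, 4, 8, 8))   # (batch, channels, height, width)
conditioning = rng.normal(size=(2, 6))     # illustrative conditioning vector

W = rng.normal(size=(6, 2 * 4)) * 0.1      # stands in for the FiLM generator
gamma, beta = np.split(conditioning @ W, 2, axis=1)  # each (batch, channels)

# Broadcast the per-channel affine transform over the spatial dimensions.
modulated = gamma[:, :, None, None] * features + beta[:, :, None, None]
print(modulated.shape)  # (2, 4, 8, 8)
```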

no code implementations • 26 Jul 2017 • Yikang Shen, Shawn Tan, Christopher Pal, Aaron Courville

We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies.
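The factorization behind any class-based hierarchical softmax can be shown in a few lines: $p(\text{word} \mid h) = p(\text{class} \mid h)\, p(\text{word} \mid \text{class}, h)$, so only two small softmaxes are evaluated instead of one over the full vocabulary. The random scores below are stand-ins, not the paper's self-organizing scheme:

```python
import numpy as np

# Two-level class-based softmax over a 1000-word toy vocabulary.
rng = np.random.default_rng(0)
n_classes, words_per_class = 10, 100

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

class_probs = softmax(rng.normal(size=n_classes))
word_given_class = np.stack([softmax(rng.normal(size=words_per_class))
                             for _ in range(n_classes)])

# Probability of word 42 inside class 3: one class softmax (size 10) plus
# one within-class softmax (size 100), instead of a size-1000 softmax.
p_word = class_probs[3] * word_given_class[3, 42]
print(p_word)
```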

2 code implementations • 10 Jul 2017 • Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville

Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

3 code implementations • NeurIPS 2017 • Harm de Vries, Florian Strub, Jérémie Mary, Hugo Larochelle, Olivier Pietquin, Aaron Courville

It is commonly assumed that language refers to high-level visual concepts while leaving low-level visual processing unaffected.

2 code implementations • ICML 2017 • Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness.

no code implementations • WS 2017 • Sai Rajeswar, Sandeep Subramanian, Francis Dutil, Christopher Pal, Aaron Courville

Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation.

98 code implementations • NeurIPS 2017 • Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville

Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability.

Ranked #3 on Image Generation on CAT 256x256
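The paper's remedy for that instability is a gradient penalty on the critic. A sketch of the penalty term on a toy linear critic $D(x) = w \cdot x$, whose input gradient is $w$ everywhere (weights are illustrative):

```python
import numpy as np

# WGAN-GP penalty: push the critic's input-gradient norm toward 1 at points
# interpolated between real and generated samples.
rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, size=(64, 3))
fake = rng.normal(loc=-2.0, size=(64, 3))
w = np.array([0.5, 1.0, 2.0])              # critic weights, so grad_x D(x) = w

# Sample points on lines between real and fake pairs.
eps = rng.uniform(size=(64, 1))
interp = eps * real + (1.0 - eps) * fake

grad_norm = np.linalg.norm(np.broadcast_to(w, interp.shape), axis=1)
penalty = 10.0 * ((grad_norm - 1.0) ** 2).mean()  # lambda = 10 as in the paper
print(penalty)
```

For a real critic the gradient would come from backpropagation through D at the interpolates rather than being constant.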

2 code implementations • 15 Mar 2017 • Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.

1 code implementation • 6 Feb 2017 • Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove that this framework not only ensures that the generator converges to the true data distribution, but also enables the discriminator to retain density information at the global optimum.

Ranked #17 on Conditional Image Generation on CIFAR-10

1 code implementation • 10 Jan 2017 • Ying Zhang, Mohammad Pezeshki, Philemon Brakel, Saizheng Zhang, Cesar Laurent, Yoshua Bengio, Aaron Courville

Meanwhile, Connectionist Temporal Classification (CTC) with Recurrent Neural Networks (RNNs), which was proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of a hybrid setup.

3 code implementations • 22 Dec 2016 • Soroush Mehri, Kundan Kumar, Ishaan Gulrajani, Rithesh Kumar, Shubham Jain, Jose Sotelo, Aaron Courville, Yoshua Bengio

In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time.

no code implementations • 12 Dec 2016 • Mehdi Mirza, Aaron Courville, Yoshua Bengio

In this work, we explore the potential of unsupervised learning to find features that promote better generalization to settings outside the supervised training distribution.

2 code implementations • 2 Dec 2016 • David Vázquez, Jorge Bernal, F. Javier Sánchez, Gloria Fernández-Esparrach, Antonio M. López, Adriana Romero, Michal Drozdzal, Aaron Courville

Colorectal cancer (CRC) is the third leading cause of cancer death worldwide.

2 code implementations • EMNLP (ACL) 2017 • Iulian V. Serban, Alexander G. Ororbia II, Joelle Pineau, Aaron Courville

Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as variational autoencoders.

3 code implementations • CVPR 2017 • Harm de Vries, Florian Strub, Sarath Chandar, Olivier Pietquin, Hugo Larochelle, Aaron Courville

Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.

2 code implementations • CVPR 2017 • Tegan Maharaj, Nicolas Ballas, Anna Rohrbach, Aaron Courville, Christopher Pal

In addition to presenting statistics and a description of the dataset, we perform a detailed analysis of 5 different models' predictions, and compare these with human performance.

1 code implementation • 15 Nov 2016 • Ishaan Gulrajani, Kundan Kumar, Faruk Ahmed, Adrien Ali Taiga, Francesco Visin, David Vazquez, Aaron Courville

Natural image modeling is a landmark challenge of unsupervised learning.

1 code implementation • NeurIPS 2016 • Alex Lamb, Anirudh Goyal, Ying Zhang, Saizheng Zhang, Aaron Courville, Yoshua Bengio

We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps.

3 code implementations • 24 Jul 2016 • Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).

Ranked #8 on Machine Translation on IWSLT2015 English-German

no code implementations • 8 Jun 2016 • Amjad Almahairi, Kyunghyun Cho, Nizar Habash, Aaron Courville

Neural machine translation has become a major alternative to widely used phrase-based statistical machine translation.

5 code implementations • 3 Jun 2016 • David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, Chris Pal

We propose zoneout, a novel method for regularizing RNNs.
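Zoneout's core move fits in one step: each hidden unit keeps its previous value with some probability instead of taking the new candidate value. A minimal sketch, with a bare tanh update standing in for the paper's LSTM cell:

```python
import numpy as np

# One zoneout step: stochastically preserve hidden units across time steps.
rng = np.random.default_rng(0)
h_prev = rng.normal(size=8)
h_candidate = np.tanh(rng.normal(size=8))   # the ordinary recurrent update

z_prob = 0.5
mask = rng.random(8) < z_prob               # True = "zone out" (keep old value)
h_new = np.where(mask, h_prev, h_candidate)
print(h_new)
```

Unlike dropout, zoned-out units carry their previous activation forward rather than being zeroed, which preserves gradient flow through time.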

7 code implementations • 2 Jun 2016 • Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville

We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process.

4 code implementations • 2 Jun 2016 • Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bo-Wen Zhou, Yoshua Bengio, Aaron Courville

We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens.

Ranked #1 on Dialogue Generation on Ubuntu Dialogue (Activity)

9 code implementations • 19 May 2016 • Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio

Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue.

no code implementations • 12 May 2016 • Anna Rohrbach, Atousa Torabi, Marcus Rohrbach, Niket Tandon, Christopher Pal, Hugo Larochelle, Aaron Courville, Bernt Schiele

In addition we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions.

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, Theano has been one of the most widely used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.

3 code implementations • 30 Mar 2016 • Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville

We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks.

Ranked #19 on Language Modelling on Text8
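The reparameterization can be sketched in simplified form: the input-to-hidden and hidden-to-hidden pre-activations are batch-normalized separately before being summed. An RNN-style tanh cell is assumed below in place of the paper's full LSTM:

```python
import numpy as np

def batch_norm(a, gamma=0.1, eps=1e-5):
    # Normalize each unit over the batch dimension; gamma = 0.1 follows the
    # paper's recommendation of a small initial scale.
    return gamma * (a - a.mean(axis=0)) / np.sqrt(a.var(axis=0) + eps)

rng = np.random.default_rng(0)
Wx = rng.normal(size=(5, 8))
Wh = rng.normal(size=(8, 8))
x_t = rng.normal(size=(32, 5))
h_prev = rng.normal(size=(32, 8))

# Separate normalization of the two contributions is the key difference
# from applying batch norm once to the summed pre-activation.
pre = batch_norm(x_t @ Wx) + batch_norm(h_prev @ Wh)
h_t = np.tanh(pre)
print(h_t.shape)  # (32, 8)
```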

1 code implementation • ACL 2016 • Iulian Vlad Serban, Alberto García-Durán, Caglar Gulcehre, Sungjin Ahn, Sarath Chandar, Aaron Courville, Yoshua Bengio

Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances.

1 code implementation • 9 Feb 2016 • Alex Lamb, Vincent Dumoulin, Aaron Courville

We propose to take advantage of this by using the representations from discriminative classifiers to augment the objective function corresponding to a generative model.

1 code implementation • 24 Nov 2015 • Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville

The low-capacity sub-networks are applied across most of the input, but also provide a guide to select a few portions of the input on which to apply the high-capacity sub-networks.

2 code implementations • 22 Nov 2015 • Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville

Moreover, ReNet layers are stacked on top of pre-trained convolutional layers, benefiting from generic local features.

Ranked #15 on Semantic Segmentation on CamVid

1 code implementation • 20 Nov 2015 • Guillaume Alain, Alex Lamb, Chinnadhurai Sankar, Aaron Courville, Yoshua Bengio

This leads the model to update using an unbiased estimate of the gradient which also has minimum variance when the sampling proposal is proportional to the L2-norm of the gradient.
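The reweighting trick can be demonstrated numerically: sample examples with probability proportional to their gradient L2-norm, divide by the inverse probability, and the averaged gradient stays unbiased. A sketch with illustrative per-example gradients:

```python
import numpy as np

# Importance sampling with a proposal proportional to the gradient L2-norm.
rng = np.random.default_rng(0)
grads = rng.normal(size=(1000, 4))              # per-example gradients
norms = np.linalg.norm(grads, axis=1)
q = norms / norms.sum()                         # proposal ~ L2-norm

idx = rng.choice(len(grads), size=200_000, p=q)
weights = 1.0 / (len(grads) * q[idx])           # importance weights
estimate = (weights[:, None] * grads[idx]).mean(axis=0)
print(estimate)  # close to grads.mean(axis=0), i.e. unbiased
```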

no code implementations • 19 Nov 2015 • Marcin Moczulski, Kelvin Xu, Aaron Courville, Kyunghyun Cho

Recently there has been growing interest in building active visual object recognizers, as opposed to the usual passive recognizers, which classify a given static image into a predefined set of object categories.

no code implementations • 19 Nov 2015 • Mohammad Pezeshki, Linxi Fan, Philemon Brakel, Aaron Courville, Yoshua Bengio

Although the empirical results are impressive, the Ladder Network has many components intertwined, whose contributions are not obvious in such a complex architecture.

2 code implementations • 19 Nov 2015 • Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville

We propose an approach to learn spatio-temporal features in videos from intermediate visual representations we call "percepts", using Gated-Recurrent-Unit Recurrent Networks (GRUs). Our method relies on percepts that are extracted from all levels of a deep convolutional network trained on the large ImageNet dataset.

1 code implementation • 19 Nov 2015 • Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio

Our idea is that this score can be interpreted as an estimate of the task loss, and that the estimation error may be used as a consistent surrogate loss.

7 code implementations • 17 Jul 2015 • Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, Joelle Pineau

We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models.

no code implementations • 4 Jul 2015 • Kyunghyun Cho, Aaron Courville, Yoshua Bengio

Whereas deep neural networks were first mostly used for classification tasks, they are rapidly expanding in the realm of structured output problems, where the observed target is composed of multiple random variables that have a rich joint distribution, given the input.

5 code implementations • NeurIPS 2015 • Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron Courville, Yoshua Bengio

In this paper, we explore the inclusion of latent random variables into the dynamic hidden state of a recurrent neural network (RNN) by combining elements of the variational autoencoder.
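The recurrence being described can be sketched in toy form: at every step a latent variable is sampled from a prior conditioned on the previous hidden state and fed into the transition. The linear-Gaussian stand-in below is illustrative, not the VRNN's actual parameterization:

```python
import numpy as np

# Toy recurrence with a per-step latent variable z_t ~ p(z_t | h_{t-1}).
rng = np.random.default_rng(0)
h = np.zeros(4)
W = rng.normal(size=(4, 4)) * 0.3
for x_t in rng.normal(size=5):
    mu = 0.1 * np.tanh(h).sum()        # prior mean depends on h_{t-1}
    z_t = mu + 0.5 * rng.normal()      # sample the latent variable
    h = np.tanh(W @ h + x_t + z_t)     # transition uses both x_t and z_t
print(h)
```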

13 code implementations • 13 May 2015 • Mohammad Havaei, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, Hugo Larochelle

Finally, we explore a cascade architecture in which the output of a basic CNN is treated as an additional source of information for a subsequent CNN.

Ranked #1 on Brain Tumor Segmentation on BRATS-2013 leaderboard

3 code implementations • 3 May 2015 • Francesco Visin, Kyle Kastner, Kyunghyun Cho, Matteo Matteucci, Aaron Courville, Yoshua Bengio

In this paper, we propose a deep neural network architecture for object recognition based on recurrent neural networks.

Ranked #34 on Image Classification on SVHN

no code implementations • 5 Mar 2015 • Samira Ebrahimi Kahou, Xavier Bouthillier, Pascal Lamblin, Caglar Gulcehre, Vincent Michalski, Kishore Konda, Sébastien Jean, Pierre Froumenty, Yann Dauphin, Nicolas Boulanger-Lewandowski, Raul Chandias Ferrari, Mehdi Mirza, David Warde-Farley, Aaron Courville, Pascal Vincent, Roland Memisevic, Christopher Pal, Yoshua Bengio

The task of the emotion recognition in the wild (EmotiW) Challenge is to assign one of seven emotions to short video clips extracted from Hollywood style movies.

1 code implementation • 3 Mar 2015 • Atousa Torabi, Christopher Pal, Hugo Larochelle, Aaron Courville

DVS (Descriptive Video Service) is an audio narration describing the visual elements and actions in a movie for the visually impaired.

5 code implementations • ICCV 2015 • Li Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, Aaron Courville

In this context, we propose an approach that successfully takes into account both the local and global temporal structure of videos to produce descriptions.

76 code implementations • 10 Feb 2015 • Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

1 code implementation • NeurIPS 2014 • Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

We propose a new framework for estimating generative models via adversarial nets, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake.

no code implementations • 1 Oct 2014 • Guillaume Desjardins, Heng Luo, Aaron Courville, Yoshua Bengio

Restricted Boltzmann Machines (RBMs) are one of the fundamental building blocks of deep learning.

175 code implementations • Proceedings of the 27th International Conference on Neural Information Processing Systems 2014 • Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio

We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake.

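The two objectives can be illustrated in one dimension (not a training loop): D is trained to maximize log D(x) + log(1 - D(G(z))). The fixed sigmoid below stands in for a learned discriminator, and `fake` for early G(z) samples:

```python
import numpy as np

# 1-D toy illustration of the discriminator and generator objectives.
rng = np.random.default_rng(0)
real = rng.normal(loc=4.0, size=1000)
fake = rng.normal(loc=0.0, size=1000)   # stand-in for G(z) early in training

def D(x):
    return 1.0 / (1.0 + np.exp(-(x - 2.0)))   # illustrative discriminator

d_loss = -(np.log(D(real)).mean() + np.log(1.0 - D(fake)).mean())
g_loss = -np.log(D(fake)).mean()        # non-saturating generator loss
print(d_loss, g_loss)
```

At the (unreachable, in this frozen toy) equilibrium, D outputs 1/2 everywhere and the generated distribution matches the data distribution.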

no code implementations • 21 Dec 2013 • David Warde-Farley, Ian J. Goodfellow, Aaron Courville, Yoshua Bengio

The recently introduced dropout training criterion for neural networks has been the subject of much attention due to its simplicity and remarkable effectiveness as a regularizer, as well as its interpretation as a training procedure for an exponentially large ensemble of networks that share parameters.

1 code implementation • 21 Dec 2013 • Ian J. Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, Yoshua Bengio

Catastrophic forgetting is a problem faced by many machine learning models and algorithms.

no code implementations • 18 Dec 2013 • Vincent Dumoulin, Ian J. Goodfellow, Aaron Courville, Yoshua Bengio

Restricted Boltzmann machines (RBMs) are powerful machine learning models, but learning and some kinds of inference in the model require sampling-based approximations, which, in classical digital computers, are implemented using expensive MCMC.