Search Results for author: Aaron Courville

Found 209 papers, 122 papers with code

Unsupervised Dependency Graph Network

1 code implementation ACL 2022 Yikang Shen, Shawn Tan, Alessandro Sordoni, Peng Li, Jie zhou, Aaron Courville

We introduce a new model, the Unsupervised Dependency Graph Network (UDGN), that can induce dependency structures from raw corpora and the masked language modeling task.

Language Modelling Masked Language Modeling +3

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

1 code implementation23 Oct 2024 Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville

However, asynchronous training relies on an underexplored regime, online but off-policy RLHF: learning on samples from previous iterations of our model.

Instruction Following Language Modelling +1

Stick-breaking Attention

1 code implementation23 Oct 2024 Shawn Tan, Yikang Shen, Songlin Yang, Aaron Courville, Rameswar Panda

We propose an alternative attention mechanism based on the stick-breaking process: For each token before the current, we determine a break point $\beta_{i, j}$, which represents the proportion of the remaining stick to allocate to the current token.
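
The stick-breaking weighting described above can be written out directly. Below is a minimal, loop-based Python sketch, assuming the break fractions come from a sigmoid of scaled query-key scores; the function name and scaling are placeholders, and the paper's efficient batched implementation differs.

    import torch

    def stick_breaking_attention(q, k, v):
        # q, k, v: (seq_len, dim). For query position i, each earlier token j gets
        # weight beta[i, j] * prod of (1 - beta[i, m]) over tokens m between j and i,
        # i.e. tokens closer to i take their share of the remaining stick first.
        T, d = q.shape
        beta = torch.sigmoid(q @ k.t() / d ** 0.5)   # break fractions in (0, 1)
        weights = torch.zeros(T, T)
        for i in range(T):
            remaining = 1.0
            for j in range(i - 1, -1, -1):           # walk backwards over the prefix
                weights[i, j] = beta[i, j] * remaining
                remaining = remaining * (1.0 - beta[i, j])
        return weights @ v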

Neuroplastic Expansion in Deep Reinforcement Learning

no code implementations10 Oct 2024 Jiashun Liu, Johan Obando-Ceron, Aaron Courville, Ling Pan

The loss of plasticity in learning agents, analogous to the solidification of neural pathways in biological brains, significantly impedes learning and adaptation in reinforcement learning due to its non-stationary nature.

Deep Reinforcement Learning reinforcement-learning

VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

1 code implementation2 Oct 2024 Amirhossein Kazemnejad, Milad Aghajohari, Eva Portelance, Alessandro Sordoni, Siva Reddy, Aaron Courville, Nicolas Le Roux

In this work, we systematically evaluate the efficacy of value networks and reveal their significant shortcomings in reasoning-heavy LLM tasks, showing that they barely outperform a random baseline when comparing alternative steps.

GSM8K Math +1

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

no code implementations2 Oct 2024 Ghada Sokar, Johan Obando-Ceron, Aaron Courville, Hugo Larochelle, Pablo Samuel Castro

The use of deep neural networks in reinforcement learning (RL) often suffers from performance degradation as model size increases.

Reinforcement Learning (RL)

Managing multiple agents by automatically adjusting incentives

no code implementations3 Sep 2024 Shunichi Akatsuka, Yaemi Teramoto, Aaron Courville

In the coming years, AI agents will be used for making more complex decisions, including in situations involving many different groups of people.

AI Agent Management

SAFT: Towards Out-of-Distribution Generalization in Fine-Tuning

no code implementations3 Jul 2024 Bac Nguyen, Stefan Uhlich, Fabien Cardinaux, Lukas Mauch, Marzieh Edraki, Aaron Courville

While a pre-trained vision-language model like CLIP has demonstrated remarkable zero-shot performance, further adaptation of the model to downstream tasks leads to undesirable degradation for OOD data.

Few-Shot Learning General Knowledge +2

GenRL: Multimodal-foundation world models for generalization in embodied agents

1 code implementation26 Jun 2024 Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt, Aaron Courville, Sai Rajeswar

In this work, we overcome these problems by presenting multimodal-foundation world models, able to connect and align the representation of foundation VLMs with the latent space of generative world models for RL, without any language annotations.

Benchmarking Reinforcement Learning (RL)

On the consistency of hyper-parameter selection in value-based deep reinforcement learning

1 code implementation25 Jun 2024 Johan Obando-Ceron, João G. M. Araújo, Aaron Courville, Pablo Samuel Castro

This paper conducts an extensive empirical study focusing on the reliability of hyper-parameter selection for value-based deep reinforcement learning agents, including the introduction of a new score to quantify the consistency and reliability of various hyper-parameters.

Deep Reinforcement Learning reinforcement-learning

Advantage Alignment Algorithms

no code implementations20 Jun 2024 Juan Agustin Duque, Milad Aghajohari, Tim Cooijmans, Razvan Ciuca, Tianyu Zhang, Gauthier Gidel, Aaron Courville

In this work, we introduce Advantage Alignment, a family of algorithms derived from first principles that perform opponent shaping efficiently and intuitively.

Autonomous Vehicles Decision Making +2

The Curse of Diversity in Ensemble-Based Exploration

2 code implementations7 May 2024 Zhixuan Lin, Pierluca D'Oro, Evgenii Nikishin, Aaron Courville

We uncover a surprising phenomenon in deep reinforcement learning: training a diverse ensemble of data-sharing agents -- a well-established exploration strategy -- can significantly impair the performance of the individual ensemble members when compared to standard single-agent training.

Attribute continuous-control +4

LOQA: Learning with Opponent Q-Learning Awareness

no code implementations2 May 2024 Milad Aghajohari, Juan Agustin Duque, Tim Cooijmans, Aaron Courville

In various real-world scenarios, interactions among agents often resemble the dynamics of general-sum games, where each agent strives to optimize its own utility.

Q-Learning

Modeling Caption Diversity in Contrastive Vision-Language Pretraining

no code implementations30 Apr 2024 Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas

Contrastive Language Pretraining (CLIP) on the other hand, works by mapping an image and its caption to a single vector -- limiting how well CLIP-like models can represent the diverse ways to describe an image.

Diversity Zero-Shot Learning

SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision

1 code implementation24 Apr 2024 Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville

Using SPARO, we demonstrate improvements on downstream recognition, robustness, retrieval, and compositionality benchmarks with CLIP (up to +14% for ImageNet, +4% for SugarCrepe), and on nearest neighbors and linear probe for ImageNet with DINO (+3% each).

Inductive Bias Representation Learning

Best Response Shaping

no code implementations5 Apr 2024 Milad Aghajohari, Tim Cooijmans, Juan Agustin Duque, Shunichi Akatsuka, Aaron Courville

We investigate the challenge of multi-agent deep reinforcement learning in partially competitive environments, where traditional methods struggle to foster reciprocity-based cooperation.

Deep Reinforcement Learning Question Answering

Scattered Mixture-of-Experts Implementation

2 code implementations13 Mar 2024 Shawn Tan, Yikang Shen, Rameswar Panda, Aaron Courville

We present ScatterMoE, an implementation of Sparse Mixture-of-Experts (SMoE) on GPUs.

In value-based deep reinforcement learning, a pruned network is a good network

no code implementations19 Feb 2024 Johan Obando-Ceron, Aaron Courville, Pablo Samuel Castro

Recent work has shown that deep reinforcement learning agents have difficulty in effectively using their network parameters.

Deep Reinforcement Learning reinforcement-learning

V-STaR: Training Verifiers for Self-Taught Reasoners

no code implementations9 Feb 2024 Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Aaron Courville, Alessandro Sordoni, Rishabh Agarwal

Common self-improvement approaches for large language models (LLMs), such as STaR, iteratively fine-tune LLMs on self-generated solutions to improve their problem-solving ability.

Code Generation Math

Language Model Alignment with Elastic Reset

1 code implementation NeurIPS 2023 Michael Noukhovitch, Samuel Lavoie, Florian Strub, Aaron Courville

We periodically reset the online model to an exponentially moving average (EMA) of itself, then reset the EMA model to the initial model.
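
Read literally, the reset schedule is simple to state in code. A minimal sketch, assuming a fixed EMA decay and reset interval (both hyper-parameters here are invented, not taken from the paper):

    import copy
    import torch

    def elastic_reset_step(online, ema, init_state, step, decay=0.999, reset_every=10_000):
        # Every step the EMA tracks the online model; every `reset_every` steps the
        # online model is reset to the EMA, and the EMA is reset to the initial weights.
        with torch.no_grad():
            for p_ema, p_on in zip(ema.parameters(), online.parameters()):
                p_ema.mul_(decay).add_(p_on, alpha=1.0 - decay)
        if step > 0 and step % reset_every == 0:
            online.load_state_dict(copy.deepcopy(ema.state_dict()))   # online <- EMA
            ema.load_state_dict(copy.deepcopy(init_state))            # EMA <- initial model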

Chatbot Language Modelling +1

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

1 code implementation21 Nov 2023 Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).

Sparse Universal Transformer

2 code implementations11 Oct 2023 Shawn Tan, Yikang Shen, Zhenfang Chen, Aaron Courville, Chuang Gan

The Universal Transformer (UT) is a variant of the Transformer that shares parameters across its layers.

Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

2 code implementations4 Oct 2023 Dinghuai Zhang, Ricky T. Q. Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio

We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics.

Meta-Value Learning: a General Framework for Learning with Learning Awareness

1 code implementation17 Jul 2023 Tim Cooijmans, Milad Aghajohari, Aaron Courville

Gradient-based learning in multi-agent systems is difficult because the gradient derives from a first-order model which does not account for the interaction between agents' learning processes.

Q-Learning

Bigger, Better, Faster: Human-level Atari with human-level efficiency

3 code implementations30 May 2023 Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark.

Atari Games 100k

Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets

1 code implementation26 May 2023 Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan

In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose to train conditional GFlowNets to sample from the solution space.

Combinatorial Optimization

Distributional GFlowNets with Quantile Flows

1 code implementation11 Feb 2023 Dinghuai Zhang, Ling Pan, Ricky T. Q. Chen, Aaron Courville, Yoshua Bengio

Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating complex combinatorial structure through a series of decision-making steps.

Decision Making

Versatile Energy-Based Probabilistic Models for High Energy Physics

1 code implementation NeurIPS 2023 Taoli Cheng, Aaron Courville

As a classical generative modeling approach, energy-based models have the natural advantage of flexibility in the form of the energy function.

Teaching Algorithmic Reasoning via In-context Learning

no code implementations15 Nov 2022 Hattie Zhou, Azade Nova, Hugo Larochelle, Aaron Courville, Behnam Neyshabur, Hanie Sedghi

Large language models (LLMs) have shown increasing in-context learning capabilities through scaling up model and data size.

In-Context Learning

On the Compositional Generalization Gap of In-Context Learning

no code implementations15 Nov 2022 Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni, Aaron Courville

In this work, we look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.

In-Context Learning Semantic Parsing

Generative Augmented Flow Networks

no code implementations7 Oct 2022 Ling Pan, Dinghuai Zhang, Aaron Courville, Longbo Huang, Yoshua Bengio

We specify intermediate rewards by intrinsic motivation to tackle the exploration problem in sparse reward environments.

Diversity

Latent State Marginalization as a Low-cost Approach for Improving Exploration

1 code implementation3 Oct 2022 Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen

While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.

continuous-control Continuous Control +2

Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels

1 code implementation24 Sep 2022 Sai Rajeswar, Pietro Mazzaglia, Tim Verbelen, Alexandre Piché, Bart Dhoedt, Aaron Courville, Alexandre Lacoste

In this work, we study the URLB and propose a new method to solve it, using unsupervised model-based RL to pre-train the agent, and a task-aware fine-tuning strategy combined with a newly proposed hybrid planner, Dyna-MPC, to adapt the agent to downstream tasks.

reinforcement-learning Reinforcement Learning +2

Riemannian Diffusion Models

no code implementations16 Aug 2022 Chin-wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville

In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation.

Image Generation

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS

no code implementations30 Jun 2022 Kyle Kastner, Aaron Courville

This paper introduces R-MelNet, a two-part autoregressive architecture with a frontend based on the first tier of MelNet and a backend WaveRNN-style audio decoder for neural text-to-speech synthesis.

Decoder Speech Synthesis +2

Building Robust Ensembles via Margin Boosting

1 code implementation7 Jun 2022 Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala

Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks.

Adversarial Robustness

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

1 code implementation3 Jun 2022 Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

To address these issues, we present reincarnating RL as an alternative workflow or class of problem settings, where prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another.

Atari Games Humanoid Control +3

Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods

no code implementations2 Jun 2022 Yuchen Lu, Zhen Liu, Aristide Baratin, Romain Laroche, Aaron Courville, Alessandro Sordoni

We address the problem of evaluating the quality of self-supervised learning (SSL) models without access to supervised labels, while being agnostic to the architecture, learning algorithm or data manipulation used during training.

Domain Generalization Self-Supervised Learning

The Primacy Bias in Deep Reinforcement Learning

1 code implementation16 May 2022 Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville

This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later.

Atari Games 100k Deep Reinforcement Learning +2

Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

1 code implementation1 Apr 2022 Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation.
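
The projection itself is short to write down. A minimal sketch, where the sizes L and V and the linear projection are placeholders:

    import torch

    class SimplicialEmbedding(torch.nn.Module):
        # Maps a feature vector to L groups of V logits and applies a softmax within
        # each group, so the output lies on L simplices of dimension V.
        def __init__(self, in_dim, L=16, V=64):
            super().__init__()
            self.L, self.V = L, V
            self.proj = torch.nn.Linear(in_dim, L * V)

        def forward(self, z):
            logits = self.proj(z).view(*z.shape[:-1], self.L, self.V)
            return torch.softmax(logits, dim=-1)   # each of the L groups sums to 1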

Classification Inductive Bias +1

Generative Flow Networks for Discrete Probabilistic Modeling

2 code implementations3 Feb 2022 Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio

We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data.

Invariant Representation Driven Neural Classifier for Anti-QCD Jet Tagging

no code implementations18 Jan 2022 Taoli Cheng, Aaron Courville

We leverage representation learning and the inductive bias in neural-net-based Standard Model jet classification tasks, to detect non-QCD signal jets.

Anomaly Detection Inductive Bias +2

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

no code implementations ICLR 2022 Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine

In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.

Atari Games D4RL +4

Chunked Autoregressive GAN for Conditional Waveform Synthesis

1 code implementation ICLR 2022 Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio

We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.

Inductive Bias

Unifying Likelihood-free Inference with Black-box Optimization and Beyond

no code implementations ICLR 2022 Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville

Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.

Drug Discovery

Learning to Dequantise with Truncated Flows

no code implementations ICLR 2022 Shawn Tan, Chin-wei Huang, Alessandro Sordoni, Aaron Courville

Additionally, since the support of the marginal $q(z)$ is bounded and the support of the prior $p(z)$ is not, we propose renormalising the prior distribution over the support of $q(z)$.
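
In symbols (notation ours, inferred from the sentence above), the renormalised prior is

    $$ \tilde{p}(z) \;=\; \frac{p(z)\,\mathbf{1}\{z \in \operatorname{supp}(q)\}}{\int_{\operatorname{supp}(q)} p(z')\,\mathrm{d}z'} $$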

Variational Inference

Learnability and Expressiveness in Self-Supervised Learning

no code implementations29 Sep 2021 Yuchen Lu, Zhen Liu, Alessandro Sordoni, Aristide Baratin, Romain Laroche, Aaron Courville

In this work, we argue that representations induced by self-supervised learning (SSL) methods should both be expressive and learnable.

Data Augmentation Self-Supervised Learning

Overcoming Label Ambiguity with Multi-label Iterated Learning

no code implementations29 Sep 2021 Sai Rajeswar Mudumba, Pau Rodriguez, Soumye Singhal, David Vazquez, Aaron Courville

This ambiguity biases models towards a single prediction, which could result in the suppression of classes that tend to co-occur in the data.

Multi-Label Learning Transfer Learning

INFERNO: Inferring Object-Centric 3D Scene Representations without Supervision

no code implementations29 Sep 2021 Lluis Castrejon, Nicolas Ballas, Aaron Courville

Each object representation defines a localized neural radiance field that is used to generate 2D views of the scene through a differentiable rendering process.

Object Video Object Tracking +1

Inducing Reusable Skills From Demonstrations with Option-Controller Network

no code implementations29 Sep 2021 Siyuan Zhou, Yikang Shen, Yuchen Lu, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

With the isolation of information and the synchronous calling mechanism, we can impose a division of work between the controller and options in an end-to-end training regime.

On Bonus-Based Exploration Methods in the Arcade Learning Environment

no code implementations22 Sep 2021 Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge

Deep Reinforcement Learning at the Edge of the Statistical Precipice

3 code implementations NeurIPS 2021 Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.
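
One simple way to expose that uncertainty, sketched below with invented function and argument names (the paper develops a more careful protocol, so this is illustrative only), is to bootstrap the aggregate score over training runs instead of reporting a single number:

    import numpy as np

    def aggregate_score_interval(scores, aggregate=np.median, n_boot=2000, alpha=0.05, seed=0):
        # scores: array of shape (runs, tasks). Resample runs with replacement and
        # recompute the aggregate to obtain an interval rather than a point estimate.
        rng = np.random.default_rng(seed)
        runs = scores.shape[0]
        stats = []
        for _ in range(n_boot):
            idx = rng.integers(0, runs, size=runs)
            stats.append(aggregate(scores[idx].mean(axis=0)))   # aggregate across tasks
        return np.quantile(stats, [alpha / 2, 1 - alpha / 2])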

Deep Reinforcement Learning reinforcement-learning +1

A Variational Perspective on Diffusion-Based Generative Models and Score Matching

1 code implementation NeurIPS 2021 Chin-wei Huang, Jae Hyun Lim, Aaron Courville

Under this framework, we show that minimizing the score-matching loss is equivalent to maximizing a lower bound of the likelihood of the plug-in reverse SDE proposed by Song et al. (2021), bridging the theoretical gap.

Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?

no code implementations5 Jun 2021 Dinghuai Zhang, Kartik Ahuja, Yilun Xu, Yisen Wang, Aaron Courville

Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization?

Out-of-Distribution Generalization

Hierarchical Video Generation for Complex Data

no code implementations4 Jun 2021 Lluis Castrejon, Nicolas Ballas, Aaron Courville

Inspired by this we propose a hierarchical model for video generation which follows a coarse to fine approach.

Video Generation

Understanding by Understanding Not: Modeling Negation in Language Models

1 code implementation NAACL 2021 Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, R Devon Hjelm, Alessandro Sordoni, Aaron Courville

To improve language models in this regard, we propose to augment the language modeling objective with an unlikelihood objective that is based on negated generic sentences from a raw text corpus.
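
A generic sketch of such a combined objective (simplified; the weighting and where the negated sentences come from are assumptions, not the paper's exact formulation): the usual next-token likelihood on ordinary text plus an unlikelihood term that lowers the probability of continuations of negated generic sentences.

    import torch
    import torch.nn.functional as F

    def lm_with_unlikelihood(logits, targets, neg_logits, neg_targets, alpha=1.0):
        # Standard next-token NLL on ordinary text ...
        nll = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
        # ... plus -log(1 - p) on the tokens that continue negated sentences.
        p_neg = F.softmax(neg_logits, dim=-1).gather(-1, neg_targets.unsqueeze(-1)).squeeze(-1)
        unlikelihood = -torch.log1p(-p_neg.clamp(max=1.0 - 1e-6)).mean()
        return nll + alpha * unlikelihood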

Language Modelling Negation

Iterated learning for emergent systematicity in VQA

no code implementations ICLR 2021 Ankit Vani, Max Schwarzer, Yuchen Lu, Eeshan Dhekane, Aaron Courville

Although neural module networks have an architectural bias towards compositionality, they require gold standard layouts to generalize systematically in practice.

Question Answering Systematic Generalization +1

Touch-based Curiosity for Sparse-Reward Tasks

1 code implementation1 Apr 2021 Sai Rajeswar, Cyril Ibrahim, Nitin Surya, Florian Golemo, David Vazquez, Aaron Courville, Pedro O. Pinheiro

Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary in tasks that involve contact-rich motion.

Learning Task Decomposition with Ordered Memory Policy Network

no code implementations19 Mar 2021 Yuchen Lu, Yikang Shen, Siyuan Zhou, Aaron Courville, Joshua B. Tenenbaum, Chuang Gan

The discovered subtask hierarchy could be used to perform task decomposition, recovering the subtask boundaries in an unstructured demonstration.

Inductive Bias

Emergent Communication under Competition

1 code implementation25 Jan 2021 Michael Noukhovitch, Travis LaCroix, Angeliki Lazaridou, Aaron Courville

First, we show that communication is proportional to cooperation, and it can occur for partially competitive scenarios using standard learning algorithms.

Misconceptions

SSW-GAN: Scalable Stage-wise Training of Video GANs

no code implementations1 Jan 2021 Lluis Castrejon, Nicolas Ballas, Aaron Courville

Current state-of-the-art generative models for videos have high computational requirements that impede high resolution generations beyond a few frames.

Neural Approximate Sufficient Statistics for Likelihood-free Inference

no code implementations ICLR 2021 Yanzhi Chen, Dinghuai Zhang, Michael U. Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for likelihood-free inference where the evaluation of likelihood function is intractable but sampling / simulating data from the model is possible.

Systematic generalisation with group invariant predictions

no code implementations ICLR 2021 Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville

We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.

StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling

2 code implementations ACL 2021 Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville

There are two major classes of natural language grammar -- the dependency grammar that models one-to-one correspondences between words and the constituency grammar that models the assembly of one or several corresponded words.

Constituency Parsing Language Modelling +2

Bijective-Contrastive Estimation

no code implementations AABI Symposium 2021 Jae Hyun Lim, Chin-wei Huang, Aaron Courville, Christopher Pal

In this work, we propose Bijective-Contrastive Estimation (BCE), a classification-based learning criterion for energy-based models.

Classification

NU-GAN: High resolution neural upsampling with GAN

no code implementations22 Oct 2020 Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville

In this paper, we propose NU-GAN, a new method for resampling audio from lower to higher sampling rates (upsampling).

Audio Generation Speech Synthesis +2

Neural Approximate Sufficient Statistics for Implicit Models

1 code implementation20 Oct 2020 Yanzhi Chen, Dinghuai Zhang, Michael Gutmann, Aaron Courville, Zhanxing Zhu

We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models where the evaluation of the likelihood function is intractable, but sampling data from the model is possible.

Integrating Categorical Semantics into Unsupervised Domain Translation

1 code implementation ICLR 2021 Samuel Lavoie, Faruk Ahmed, Aaron Courville

While unsupervised domain translation (UDT) has seen a lot of success recently, we argue that mediating its translation via categorical semantic features could broaden its applicability.

Object Translation

Data-Efficient Reinforcement Learning with Self-Predictive Representations

1 code implementation ICLR 2021 Max Schwarzer, Ankesh Anand, Rishab Goel, R. Devon Hjelm, Aaron Courville, Philip Bachman

We further improve performance by adding data augmentation to the future prediction loss, which forces the agent's representations to be consistent across multiple views of an observation.

Atari Games 100k Data Augmentation +6

AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation

2 code implementations ICML 2020 Jae Hyun Lim, Aaron Courville, Christopher Pal, Chin-wei Huang

Entropy is ubiquitous in machine learning, but it is in general intractable to compute the entropy of the distribution of an arbitrary continuous random variable.

continuous-control Continuous Control +2

Graph Density-Aware Losses for Novel Compositions in Scene Graph Generation

1 code implementation17 May 2020 Boris Knyazev, Harm de Vries, Cătălina Cangea, Graham W. Taylor, Aaron Courville, Eugene Belilovsky

We show that such models can suffer the most in their ability to generalize to rare compositions, evaluating two different models on the Visual Genome dataset and its more recent, improved version, GQA.

Graph Generation Scene Graph Generation

Countering Language Drift with Seeded Iterated Learning

no code implementations ICML 2020 Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville

At each time step, the teacher is created by copying the student agent, before being finetuned to maximize task completion.

Translation

Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images using a View-based Representation

1 code implementation23 Mar 2020 Sai Rajeswar, Fahim Mannan, Florian Golemo, Jérôme Parent-Lévesque, David Vazquez, Derek Nowrouzezahrai, Aaron Courville

We propose Pix2Shape, an approach to solve this problem with four components: (i) an encoder that infers the latent 3D representation from an image, (ii) a decoder that generates an explicit 2.5D surfel-based reconstruction of a scene from the latent code, (iii) a differentiable renderer that synthesizes a 2D image from the surfel representation, and (iv) a critic network trained to discriminate between images generated by the decoder-renderer and those from a training distribution.

Decoder Spatial Reasoning

Solving ODE with Universal Flows: Approximation Theory for Flow-Based Models

no code implementations ICLR Workshop DeepDiffEq 2019 Chin-wei Huang, Laurent Dinh, Aaron Courville

Normalizing flows are powerful invertible probabilistic models that can be used to translate between two probability distributions, in a way that allows us to efficiently track the change of probability density.
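
The density tracking referred to above is the standard change-of-variables identity (background, not a contribution of this paper): for an invertible map $f$ with $z = f(x)$,

    $$ \log p_X(x) \;=\; \log p_Z(f(x)) + \log\left|\det \frac{\partial f(x)}{\partial x}\right| $$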

Computational Efficiency

Augmented Normalizing Flows: Bridging the Gap Between Generative Flows and Latent Variable Models

1 code implementation17 Feb 2020 Chin-wei Huang, Laurent Dinh, Aaron Courville

In this work, we propose a new family of generative flows on an augmented data space, with an aim to improve expressivity without drastically increasing the computational cost of sampling and evaluation of a lower bound on the likelihood.

Image Generation

On Bonus Based Exploration Methods In The Arcade Learning Environment

no code implementations ICLR 2020 Adrien Ali Taiga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge Reinforcement Learning

CLOSURE: Assessing Systematic Generalization of CLEVR Models

3 code implementations12 Dec 2019 Dzmitry Bahdanau, Harm de Vries, Timothy J. O'Donnell, Shikhar Murty, Philippe Beaudoin, Yoshua Bengio, Aaron Courville

In this work, we study how systematic the generalization of such models is, that is to which extent they are capable of handling novel combinations of known linguistic constructs.

Few-Shot Learning Systematic Generalization +1

What Do Compressed Deep Neural Networks Forget?

2 code implementations13 Nov 2019 Sara Hooker, Aaron Courville, Gregory Clark, Yann Dauphin, Andrea Frome

However, this measure of performance conceals significant differences in how different classes and images are impacted by model compression techniques.

Fairness Interpretability Techniques for Deep Learning +4

Ordered Memory

1 code implementation NeurIPS 2019 Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville

Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory.
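
As a toy illustration only (not the model's actual stack-like update rule), here is one way a cumulative probability can gate writing and erasing over memory slots; all names and shapes are assumptions:

    import torch

    def cumulative_gate_update(memory, attn_logits, candidate):
        # memory: (slots, dim); attn_logits: (slots,); candidate: (dim,).
        # The cumulative attention probability acts as a monotone soft gate: slots past
        # the attended position are progressively overwritten, earlier slots are kept.
        p = torch.softmax(attn_logits, dim=-1)
        gate = torch.cumsum(p, dim=-1).unsqueeze(-1)   # gate in [0, 1], per slot
        return (1.0 - gate) * memory + gate * candidate.unsqueeze(0)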

ListOps

Icentia11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery

1 code implementation21 Oct 2019 Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

We release the largest public ECG dataset of continuous raw signals for representation learning containing 11 thousand patients and 2 billion labelled beats.

Clustering Representation Learning

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

21 code implementations NeurIPS 2019 Kundan Kumar, Rithesh Kumar, Thibault de Boissiere, Lucas Gestin, Wei Zhen Teoh, Jose Sotelo, Alexandre de Brebisson, Yoshua Bengio, Aaron Courville

In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques.

Speech Synthesis Translation

{COMPANYNAME}11K: An Unsupervised Representation Learning Dataset for Arrhythmia Subtype Discovery

no code implementations25 Sep 2019 Shawn Tan, Guillaume Androz, Ahmad Chamseddine, Pierre Fecteau, Aaron Courville, Yoshua Bengio, Joseph Paul Cohen

We release the largest public ECG dataset of continuous raw signals for representation learning containing over 11k patients and 2 billion labelled beats.

Clustering Representation Learning

Selfish Emergent Communication

no code implementations25 Sep 2019 Michael Noukhovitch, Travis LaCroix, Aaron Courville

Current literature in machine learning holds that unaligned, self-interested agents do not learn to use an emergent communication channel.

Selective Brain Damage: Measuring the Disparate Impact of Model Pruning

no code implementations25 Sep 2019 Sara Hooker, Yann Dauphin, Aaron Courville, Andrea Frome

Neural network pruning techniques have demonstrated it is possible to remove the majority of weights in a network with surprisingly little degradation to top-1 test set accuracy.
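
For concreteness, a generic global magnitude-pruning sketch (the paper studies the effects of such pruning rather than proposing this routine; the sparsity level is arbitrary):

    import torch

    def global_magnitude_prune(model, sparsity=0.9):
        # Zero out the `sparsity` fraction of weights with the smallest absolute value,
        # pooled across all weight matrices of the model.
        weights = [p for p in model.parameters() if p.dim() > 1]
        threshold = torch.cat([w.detach().abs().flatten() for w in weights]).quantile(sparsity)
        with torch.no_grad():
            for w in weights:
                w.mul_((w.abs() > threshold).float())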

Network Pruning

VideoNavQA: Bridging the Gap between Visual and Embodied Question Answering

1 code implementation14 Aug 2019 Cătălina Cangea, Eugene Belilovsky, Pietro Liò, Aaron Courville

The goal of this dataset is to assess question-answering performance from nearly-ideal navigation paths, while considering a much more complete variety of questions than current instantiations of the EQA task.

Embodied Question Answering Question Answering +2

Detecting semantic anomalies

1 code implementation13 Aug 2019 Faruk Ahmed, Aaron Courville

We critically appraise the recent interest in out-of-distribution (OOD) detection and question the practical relevance of existing benchmarks.

Anomaly Detection Multi-Task Learning +2

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment

no code implementations6 Aug 2019 Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).

Benchmarking Montezuma's Revenge +1

Adversarial Computation of Optimal Transport Maps

1 code implementation24 Jun 2019 Jacob Leygonie, Jennifer She, Amjad Almahairi, Sai Rajeswar, Aaron Courville

We show that during training, our generator follows the $W_2$-geodesic between the initial and the target distributions.

Investigating Biases in Textual Entailment Datasets

no code implementations23 Jun 2019 Shawn Tan, Yikang Shen, Chin-wei Huang, Aaron Courville

The ability to understand logical relationships between sentences is an important task in language understanding.

BIG-bench Machine Learning Natural Language Inference +2

Stochastic Neural Network with Kronecker Flow

no code implementations10 Jun 2019 Chin-wei Huang, Ahmed Touati, Pascal Vincent, Gintare Karolina Dziugaite, Alexandre Lacoste, Aaron Courville

Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to scale to the high-dimensional setting of stochastic neural networks.

Thompson Sampling Variational Inference

Note on the bias and variance of variational inference

1 code implementation9 Jun 2019 Chin-wei Huang, Aaron Courville

In this note, we study the relationship between the variational gap and the variance of the (log) likelihood ratio.

Variational Inference

Batch weight for domain adaptation with mass shift

no code implementations29 May 2019 Mikołaj Bińkowski, R. Devon Hjelm, Aaron Courville

We also provide a rigorous probabilistic setting for domain transfer and a new simplified objective for training transfer networks, an alternative to the complex, multi-component loss functions used in current state-of-the-art image-to-image translation models.

Domain Adaptation Image-to-Image Translation +1

Hierarchical Importance Weighted Autoencoders

1 code implementation13 May 2019 Chin-wei Huang, Kris Sankaran, Eeshan Dhekane, Alexandre Lacoste, Aaron Courville

We believe a joint proposal has the potential of reducing the number of redundant samples, and introduce a hierarchical structure to induce correlation.

Variational Inference

Unsupervised one-to-many image translation

no code implementations ICLR 2019 Samuel Lavoie-Marchildon, Sebastien Lachapelle, Mikołaj Bińkowski, Aaron Courville, Yoshua Bengio, R. Devon Hjelm

We perform completely unsupervised one-sided image-to-image translation between a source domain $X$ and a target domain $Y$ such that we preserve relevant underlying shared semantics (e.g., class, size, shape, etc.).

Translation Unsupervised Image-To-Image Translation

Manifold Mixup: Learning Better Representations by Interpolating Hidden States

1 code implementation ICLR 2019 Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio

Because the hidden states are learned, this has the important effect of encouraging the hidden states for a class to be concentrated in such a way that interpolations within the same class or between two different classes do not intersect with the real data points from other classes.
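
A minimal sketch of the hidden-state interpolation (in the paper the layer to mix at is sampled; here the network split, the one-hot targets, and the value of alpha are placeholders):

    import numpy as np
    import torch

    def manifold_mixup(encoder, head, x, y_onehot, alpha=2.0):
        # Run the lower part of the network, mix hidden states and targets of a shuffled
        # pairing of the batch with lambda ~ Beta(alpha, alpha), then finish the forward pass.
        lam = float(np.random.beta(alpha, alpha))
        h = encoder(x)
        idx = torch.randperm(x.size(0))
        h_mix = lam * h + (1.0 - lam) * h[idx]
        y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[idx]
        return head(h_mix), y_mix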

Pix2Scene: Learning Implicit 3D Representations from Images

no code implementations ICLR 2019 Sai Rajeswar, Fahim Mannan, Florian Golemo, David Vazquez, Derek Nowrouzezahrai, Aaron Courville

Modelling 3D scenes from 2D images is a long-standing problem in computer vision with implications in, e.g., simulation and robotics.

Spatial Reasoning

EnGAN: Latent Space MCMC and Maximum Entropy Generators for Energy-based Models

no code implementations ICLR 2019 Rithesh Kumar, Anirudh Goyal, Aaron Courville, Yoshua Bengio

Unsupervised learning is about capturing dependencies between variables and is driven by the contrast between the probable vs improbable configurations of these variables, often either via a generative model which only samples probable ones or with an energy function (unnormalized log-density) which is low for probable ones and high for improbable ones.

Anomaly Detection Novelty Detection

Improved Conditional VRNNs for Video Prediction

1 code implementation ICCV 2019 Lluis Castrejon, Nicolas Ballas, Aaron Courville

To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models.

Video Generation Video Prediction

Counterpoint by Convolution

4 code implementations18 Mar 2019 Cheng-Zhi Anna Huang, Tim Cooijmans, Adam Roberts, Aaron Courville, Douglas Eck

Machine learning models of music typically break up the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end.

Music Generation Music Modeling

Maximum Entropy Generators for Energy-Based Models

2 code implementations24 Jan 2019 Rithesh Kumar, Sherjil Ozair, Anirudh Goyal, Aaron Courville, Yoshua Bengio

Maximum likelihood estimation of energy-based models is a challenging problem due to the intractability of the log-likelihood gradient.
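
The intractability comes from the second term of the standard energy-based log-likelihood gradient (a textbook identity, stated here for context), which requires samples from the model $p_\theta(x) \propto e^{-E_\theta(x)}$:

    $$ \nabla_\theta \log p_\theta(x) \;=\; -\nabla_\theta E_\theta(x) \;+\; \mathbb{E}_{x' \sim p_\theta}\!\left[\nabla_\theta E_\theta(x')\right] $$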

Anomaly Detection

Deep Generative Modeling of LiDAR Data

1 code implementation4 Dec 2018 Lucas Caccia, Herke van Hoof, Aaron Courville, Joelle Pineau

In this work, we show that one can adapt deep generative models for this task by unravelling lidar scans into a 2D point map.

Point Cloud Generation

Systematic Generalization: What Is Required and Can It Be Learned?

2 code implementations ICLR 2019 Dzmitry Bahdanau, Shikhar Murty, Michael Noukhovitch, Thien Huu Nguyen, Harm de Vries, Aaron Courville

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantiated.

Systematic Generalization Visual Question Answering (VQA)

Planning in Dynamic Environments with Conditional Autoregressive Models

1 code implementation25 Nov 2018 Johanna Hansen, Kyle Kastner, Aaron Courville, Gregory Dudek

We demonstrate the use of conditional autoregressive generative models (van den Oord et al., 2016a) over a discrete latent space (van den Oord et al., 2017b) for forward planning with MCTS.

Harmonic Recomposition using Conditional Autoregressive Modeling

1 code implementation18 Nov 2018 Kyle Kastner, Rithesh Kumar, Tim Cooijmans, Aaron Courville

We demonstrate a conditional autoregressive pipeline for efficient music recomposition, based on methods presented in van den Oord et al. (2017).

Representation Mixing for TTS Synthesis

no code implementations17 Nov 2018 Kyle Kastner, João Felipe Santos, Yoshua Bengio, Aaron Courville

Recent character and phoneme-based parametric TTS systems using deep learning have shown strong performance in natural speech generation.

W2GAN: Recovering an Optimal Transport Map with a GAN

no code implementations27 Sep 2018 Leygonie Jacob*, Jennifer She*, Amjad Almahairi, Sai Rajeswar, Aaron Courville

In this work we address the converse question: is it possible to recover an optimal map in a GAN fashion?

On Difficulties of Probability Distillation

no code implementations27 Sep 2018 Chin-wei Huang, Faruk Ahmed, Kundan Kumar, Alexandre Lacoste, Aaron Courville

Probability distillation has recently been of interest to deep learning practitioners as it presents a practical solution for sampling from autoregressive models for deployment in real-time applications.

Convergence Properties of Deep Neural Networks on Separable Data

no code implementations27 Sep 2018 Remi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.

Binary Classification

On the Learning Dynamics of Deep Neural Networks

no code implementations18 Sep 2018 Remi Tachet, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio

While a lot of progress has been made in recent years, the dynamics of learning in deep nonlinear neural networks remain to this day largely misunderstood.

Binary Classification General Classification

Improving Explorability in Variational Inference with Annealed Variational Objectives

1 code implementation NeurIPS 2018 Chin-wei Huang, Shawn Tan, Alexandre Lacoste, Aaron Courville

Despite the advances in the representational capacity of approximate distributions for variational inference, the optimization process can still limit the density that is ultimately learned.

Variational Inference

Approximate Exploration through State Abstraction

no code implementations29 Aug 2018 Adrien Ali Taïga, Aaron Courville, Marc G. Bellemare

Next, we show how a given density model can be related to an abstraction and that the corresponding pseudo-count bonus can act as a substitute in MBIE-EB combined with this abstraction, but may lead to either under- or over-exploration.
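
For context, the MBIE-EB bonus referred to above augments the reward with a count-based term (notation ours), and the pseudo-count variant replaces the visit count with a pseudo-count $\hat{N}$ derived from a density model over the abstraction $\phi$:

    $$ r^{+}(s, a) = r(s, a) + \frac{\beta}{\sqrt{N(s, a)}} \qquad\text{vs.}\qquad r^{+}(s, a) = r(s, a) + \frac{\beta}{\sqrt{\hat{N}(\phi(s))}} $$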

Reinforcement Learning

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation ECCV 2018 Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering Visual Dialog +2

Mutual Information Neural Estimation

no code implementations ICML 2018 Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeshwar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, Devon Hjelm

We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.

General Classification

On the Spectral Bias of Neural Networks

2 code implementations ICLR 2019 Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville

Neural networks are known to be a class of highly expressive functions able to fit even random input-output mappings with $100\%$ accuracy.

Learning Distributed Representations from Reviews for Collaborative Filtering

no code implementations18 Jun 2018 Amjad Almahairi, Kyle Kastner, Kyunghyun Cho, Aaron Courville

However, interestingly, the greater modeling power offered by the recurrent neural network appears to undermine the model's ability to act as a regularizer of the product representations.

Collaborative Filtering Recommendation Systems

Manifold Mixup: Better Representations by Interpolating Hidden States

12 code implementations ICLR 2019 Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio

Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.

Image Classification

Neural Autoregressive Flows

6 code implementations ICML 2018 Chin-wei Huang, David Krueger, Alexandre Lacoste, Aaron Courville

Normalizing flows and autoregressive models have been successfully combined to produce state-of-the-art results in density estimation, via Masked Autoregressive Flows (MAF), and to accelerate state-of-the-art WaveNet-based speech synthesis to 20x faster than real-time, via Inverse Autoregressive Flows (IAF).

Density Estimation Speech Synthesis

Generating Contradictory, Neutral, and Entailing Sentences

no code implementations7 Mar 2018 Yikang Shen, Shawn Tan, Chin-wei Huang, Aaron Courville

Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP).

Diversity Natural Language Inference +2

Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data

3 code implementations ICML 2018 Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, Aaron Courville

Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data.

Image Segmentation Semantic Segmentation +1

Hierarchical Adversarially Learned Inference

no code implementations ICLR 2018 Mohamed Ishmael Belghazi, Sai Rajeswar, Olivier Mastropietro, Negar Rostamzadeh, Jovana Mitrovic, Aaron Courville

We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model.

Attribute

MINE: Mutual Information Neural Estimation

22 code implementations12 Jan 2018 Mohamed Ishmael Belghazi, Aristide Baratin, Sai Rajeswar, Sherjil Ozair, Yoshua Bengio, Aaron Courville, R. Devon Hjelm

We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks.
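
MINE builds on the Donsker-Varadhan representation of the KL divergence; below is a minimal sketch of the resulting lower bound, with the critic architecture and the paper's bias-corrected gradient omitted (names and shapes are assumptions):

    import torch

    def mine_lower_bound(critic, x, z):
        # Donsker-Varadhan bound: E_joint[T(x, z)] - log E_marginals[exp(T(x, z'))].
        # The product of marginals is approximated by shuffling z within the batch.
        scores_joint = critic(x, z).squeeze(-1)
        z_shuffled = z[torch.randperm(z.size(0))]
        scores_marg = critic(x, z_shuffled).squeeze(-1)
        log_n = torch.log(torch.tensor(float(scores_marg.size(0))))
        return scores_joint.mean() - (torch.logsumexp(scores_marg, dim=0) - log_n)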

General Classification

Learning Generative Models with Locally Disentangled Latent Factors

no code implementations ICLR 2018 Brady Neal, Alex Lamb, Sherjil Ozair, Devon Hjelm, Aaron Courville, Yoshua Bengio, Ioannis Mitliagkas

One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks.

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models

no code implementations NeurIPS 2017 Alex Lamb, Devon Hjelm, Yaroslav Ganin, Joseph Paul Cohen, Aaron Courville, Yoshua Bengio

Directed latent variable models that formulate the joint distribution as $p(x, z) = p(z) p(x \mid z)$ have the advantage of fast and exact sampling.
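
Concretely, "fast and exact sampling" here is plain ancestral sampling; a small sketch in which `prior` and `conditional` are placeholder distribution objects (e.g., torch.distributions instances):

    def ancestral_sample(prior, conditional, n=16):
        # For p(x, z) = p(z) p(x|z): draw z from the prior, then x from the
        # conditional given z; no Markov chain or inference network is needed.
        z = prior.sample((n,))
        x = conditional(z).sample()
        return x, z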

Attribute

HoME: a Household Multimodal Environment

no code implementations29 Nov 2017 Simon Brodeur, Ethan Perez, Ankesh Anand, Florian Golemo, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

We introduce HoME: a Household Multimodal Environment for artificial agents to learn from vision, audio, semantics, physics, and interaction with objects and other agents, all within a realistic context.

OpenAI Gym reinforcement-learning +2

Neural Language Modeling by Jointly Learning Syntax and Lexicon

1 code implementation ICLR 2018 Yikang Shen, Zhouhan Lin, Chin-wei Huang, Aaron Courville

In this paper, we propose a novel neural language model, called the Parsing-Reading-Predict Networks (PRPN), that can simultaneously induce the syntactic structure from unannotated sentences and leverage the inferred structure to learn a better language model.

Constituency Grammar Induction Language Modelling

Learnable Explicit Density for Continuous Latent Space and Variational Inference

no code implementations6 Oct 2017 Chin-wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville

In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior.

Density Estimation Variational Inference

Self-organized Hierarchical Softmax

no code implementations26 Jul 2017 Yikang Shen, Shawn Tan, Christopher Pal, Aaron Courville

We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies.
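
For background, a generic two-level hierarchical softmax (not the paper's specific self-organising variant) factorises the word probability through a class assignment $c(w)$, cutting the normalisation cost from $O(|V|)$ to roughly $O(\sqrt{|V|})$ with balanced classes:

    $$ p(w \mid h) \;=\; p\big(c(w) \mid h\big)\, p\big(w \mid c(w), h\big) $$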

Language Modelling Sentence +1

Learning Visual Reasoning Without Strong Priors

2 code implementations10 Jul 2017 Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville

Previous work has operated under the assumption that visual reasoning calls for a specialized architecture, but we show that a general architecture with proper conditioning can learn to visually reason effectively.

Visual Reasoning

Adversarial Generation of Natural Language

no code implementations WS 2017 Sai Rajeswar, Sandeep Subramanian, Francis Dutil, Christopher Pal, Aaron Courville

Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation.

Image Generation Language Modelling +1

End-to-end optimization of goal-driven and visually grounded dialogue systems

2 code implementations15 Mar 2017 Florian Strub, Harm de Vries, Jeremie Mary, Bilal Piot, Aaron Courville, Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning.

Decoder Deep Reinforcement Learning +3

Calibrating Energy-based Generative Adversarial Networks

1 code implementation6 Feb 2017 Zihang Dai, Amjad Almahairi, Philip Bachman, Eduard Hovy, Aaron Courville

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove that this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum.

Ranked #18 on Conditional Image Generation on CIFAR-10 (Inception score metric)

Image Generation

Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks

1 code implementation10 Jan 2017 Ying Zhang, Mohammad Pezeshki, Philemon Brakel, Saizheng Zhang, Cesar Laurent, Yoshua Bengio, Aaron Courville

Meanwhile, Connectionist Temporal Classification (CTC) with Recurrent Neural Networks (RNNs), which is proposed for labeling unsegmented sequences, makes it feasible to train an end-to-end speech recognition system instead of hybrid settings.
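
For context, CTC removes the need for frame-level alignments by marginalising over every frame labelling $\pi$ (including blanks) that collapses to the target transcript $y$ under the collapsing map $\mathcal{B}$:

    $$ p(y \mid x) \;=\; \sum_{\pi \in \mathcal{B}^{-1}(y)} \prod_{t=1}^{T} p(\pi_t \mid x) $$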

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1