Search Results for author: Maxime Chevalier-Boisvert

Found 8 papers, 6 papers with code

DeepDrummer: Generating Drum Loops using Deep Learning and a Human in the Loop

1 code implementation • 10 Aug 2020 • Guillaume Alain, Maxime Chevalier-Boisvert, Frederic Osterrath, Remi Piche-Taillefer

DeepDrummer is a drum loop generation tool that uses active learning to learn the preferences (or current artistic intentions) of a human user from a small number of interactions.

Active Learning, Efficient Exploration

BabyAI 1.1

3 code implementations • 24 Jul 2020 • David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77% to 90.4%.

Computational Efficiency, Imitation Learning, +3

Combating False Negatives in Adversarial Imitation Learning

no code implementations • 2 Feb 2020 • Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior.

Imitation Learning

Options of Interest: Temporal Abstraction with Interest Functions

3 code implementations • 1 Jan 2020 • Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time.

Automated curriculum generation for Policy Gradients from Demonstrations

1 code implementation • 1 Dec 2019 • Anirudh Srinivasan, Dzmitry Bahdanau, Maxime Chevalier-Boisvert, Yoshua Bengio

In this paper, we present a technique that improves the process of training an agent (using RL) for instruction following.

Instruction Following

Option-Critic in Cooperative Multi-agent Systems

1 code implementation • 28 Nov 2019 • Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, Doina Precup

In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999).

Robo-PlaNet: Learning to Poke in a Day

no code implementations • 9 Nov 2019 • Maxime Chevalier-Boisvert, Guillaume Alain, Florian Golemo, Derek Nowrouzezahrai

Recently, the Deep Planning Network (PlaNet) approach was introduced as a model-based reinforcement learning method that learns environment dynamics directly from pixel observations.

Model-based Reinforcement Learning, Position, +1

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

6 code implementations • ICLR 2019 • Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts.

Grounded language learning
