Search Results for author: Bogdan Mazoure

Found 21 papers, 8 papers with code

Alpha-Divergences in Variational Dropout

no code implementations 12 Nov 2017 Bogdan Mazoure, Riashat Islam

We investigate the use of alternative divergences to Kullback-Leibler (KL) in variational inference (VI), based on Variational Dropout (Kingma et al., 2015).

Variational Inference

GAN Q-learning

1 code implementation 13 May 2018 Thang Doan, Bogdan Mazoure, Clare Lyle

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation.

Distributional Reinforcement Learning, OpenAI Gym, +3

On-line Adaptative Curriculum Learning for GANs

3 code implementations 31 Jul 2018 Thang Doan, Joao Monteiro, Isabela Albuquerque, Bogdan Mazoure, Audrey Durand, Joelle Pineau, R. Devon Hjelm

We argue that less expressive discriminators are smoother and have a general coarse-grained view of the modes map, which forces the generator to cover a wide portion of the data distribution support.

Multi-Armed Bandits, Stochastic Optimization

EmojiGAN: learning emojis distributions with a generative model

no code implementations WS 2018 Bogdan Mazoure, Thang Doan, Saibal Ray

Generative models have recently experienced a surge in popularity due to the development of more efficient training algorithms and increasing computational power.

Image Captioning, Style Transfer, +1

Leveraging exploration in off-policy algorithms via normalizing flows

1 code implementation 16 May 2019 Bogdan Mazoure, Thang Doan, Audrey Durand, R. Devon Hjelm, Joelle Pineau

The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios.

Continuous Control, Reinforcement Learning (RL)

Learning Gaussian Graphical Models with Ordered Weighted L1 Regularization

1 code implementation 6 Jun 2019 Cody Mazza-Anthony, Bogdan Mazoure, Mark Coates

We propose two novel estimators based on the Ordered Weighted $\ell_1$ (OWL) norm: 1) The Graphical OWL (GOWL) is a penalized likelihood method that applies the OWL norm to the lower triangle components of the precision matrix.

Computational Efficiency

Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning

no code implementations 17 Sep 2019 Thang Doan, Bogdan Mazoure, Moloud Abdar, Audrey Durand, Joelle Pineau, R. Devon Hjelm

Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions.

Continuous Control, reinforcement-learning, +1

Deep Reinforcement and InfoMax Learning

1 code implementation NeurIPS 2020 Bogdan Mazoure, Remi Tachet des Combes, Thang Doan, Philip Bachman, R. Devon Hjelm

We begin with the hypothesis that a model-free agent whose representations are predictive of properties of future states (beyond expected rewards) will be more capable of solving and adapting to new RL problems.

Continual Learning

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

2 code implementations 7 Oct 2020 Thang Doan, Mehdi Bennani, Bogdan Mazoure, Guillaume Rabusseau, Pierre Alquier

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime.

Continual Learning

Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL

1 code implementation ICLR 2022 Bogdan Mazoure, Ahmed M. Ahmed, Patrick MacAlpine, R Devon Hjelm, Andrey Kolobov

A highly desirable property of a reinforcement learning (RL) agent -- and a major difficulty for deep RL approaches -- is the ability to generalize policies learned on a few tasks over a high-dimensional observation space to similar tasks not seen during training.

Reinforcement Learning (RL), Representation Learning, +1

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

no code implementations 29 Nov 2021 Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.

Contrastive Learning, Decision Making, +5

The Sandbox Environment for Generalizable Agent Research (SEGAR)

1 code implementation 19 Mar 2022 R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov

A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly landmark progress.

Decision Making

Sequential Density Estimation via Nonlinear Continuous Weighted Finite Automata

no code implementations 8 Jun 2022 Tianyu Li, Bogdan Mazoure, Guillaume Rabusseau

Although WFAs have been extended to deal with continuous input data, namely continuous WFAs (CWFAs), it is still unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, due to the limitation on the expressiveness of the model as well as the tractability of approximating density functions via CWFAs.

Density Estimation

Contrastive Value Learning: Implicit Models for Simple Offline RL

no code implementations 3 Nov 2022 Bogdan Mazoure, Benjamin Eysenbach, Ofir Nachum, Jonathan Tompson

In this paper, we propose Contrastive Value Learning (CVL), which learns an implicit, multi-step model of the environment dynamics.

Continuous Control, Model-based Reinforcement Learning, +2

Accelerating exploration and representation learning with offline pre-training

no code implementations 31 Mar 2023 Bogdan Mazoure, Jake Bruce, Doina Precup, Rob Fergus, Ankit Anand

In this work, we follow the hypothesis that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.

Decision Making, NetHack, +2

Value function estimation using conditional diffusion models for control

no code implementations 9 Jun 2023 Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

A fairly reliable trend in deep reinforcement learning is that performance scales with the number of parameters, provided a complementary scaling in the amount of training data.

Continuous Control
