Search Results for author: Bogdan Mazoure

Found 21 papers, 8 papers with code

Alpha-Divergences in Variational Dropout

no code implementations • 12 Nov 2017 • Bogdan Mazoure, Riashat Islam

We investigate the use of divergences other than the Kullback-Leibler (KL) divergence in variational inference (VI), based on Variational Dropout \cite{kingma2015}.

Variational Inference

GAN Q-learning

1 code implementation • 13 May 2018 • Thang Doan, Bogdan Mazoure, Clare Lyle

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation.

Distributional Reinforcement Learning OpenAI Gym +3

On-line Adaptative Curriculum Learning for GANs

3 code implementations • 31 Jul 2018 • Thang Doan, Joao Monteiro, Isabela Albuquerque, Bogdan Mazoure, Audrey Durand, Joelle Pineau, R. Devon Hjelm

We argue that less expressive discriminators are smoother and have a general, coarse-grained view of the modes map, which forces the generator to cover a wide portion of the data distribution support.

Multi-Armed Bandits Stochastic Optimization

EmojiGAN: learning emojis distributions with a generative model

no code implementations • WS 2018 • Bogdan Mazoure, Thang Doan, Saibal Ray

Generative models have recently experienced a surge in popularity due to the development of more efficient training algorithms and increasing computational power.

Image Captioning Style Transfer +1

Exploring attention mechanism for acoustic-based classification of speech utterances into system-directed and non-system-directed

no code implementations • 1 Feb 2019 • Atta Norouzian, Bogdan Mazoure, Dermot Connolly, Daniel Willett

This would require the virtual assistant (VA) to be able to detect speech that is directed at it and respond accordingly.

General Classification

Leveraging exploration in off-policy algorithms via normalizing flows

1 code implementation • 16 May 2019 • Bogdan Mazoure, Thang Doan, Audrey Durand, R. Devon Hjelm, Joelle Pineau

The ability to discover approximately optimal policies in domains with sparse rewards is crucial to applying reinforcement learning (RL) in many real-world scenarios.

Continuous Control Reinforcement Learning (RL)

Learning Gaussian Graphical Models with Ordered Weighted L1 Regularization

1 code implementation • 6 Jun 2019 • Cody Mazza-Anthony, Bogdan Mazoure, Mark Coates

We propose two novel estimators based on the Ordered Weighted $\ell_1$ (OWL) norm: 1) The Graphical OWL (GOWL) is a penalized likelihood method that applies the OWL norm to the lower triangle components of the precision matrix.

Computational Efficiency

Attraction-Repulsion Actor-Critic for Continuous Control Reinforcement Learning

no code implementations • 17 Sep 2019 • Thang Doan, Bogdan Mazoure, Moloud Abdar, Audrey Durand, Joelle Pineau, R. Devon Hjelm

Continuous control tasks in reinforcement learning are important because they provide a framework for learning in high-dimensional state spaces with deceptive rewards, where the agent can easily become trapped in suboptimal solutions.

Continuous Control reinforcement-learning +1

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

no code implementations • 12 Nov 2019 • Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

Learning and planning in partially-observable domains is one of the most difficult problems in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces

no code implementations • 7 Feb 2020 • Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau

We propose a general framework for policy representation for reinforcement learning tasks.

reinforcement-learning Reinforcement Learning (RL)

Deep Reinforcement and InfoMax Learning

1 code implementation • NeurIPS 2020 • Bogdan Mazoure, Remi Tachet des Combes, Thang Doan, Philip Bachman, R. Devon Hjelm

We begin with the hypothesis that a model-free agent whose representations are predictive of properties of future states (beyond expected rewards) will be more capable of solving and adapting to new RL problems.

Continual Learning

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix

2 code implementations • 7 Oct 2020 • Thang Doan, Mehdi Bennani, Bogdan Mazoure, Guillaume Rabusseau, Pierre Alquier

Continual learning (CL) is a setting in which an agent has to learn from an incoming stream of data during its entire lifetime.

Continual Learning

Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning

no code implementations • 1 Jun 2021 • Bogdan Mazoure, Paul Mineiro, Pavithra Srinath, Reza Sharifi Sedeh, Doina Precup, Adith Swaminathan

Targeting immediately measurable proxies such as clicks can lead to suboptimal recommendations due to misalignment with the long-term metric.

Offline RL reinforcement-learning +2

Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL

1 code implementation • ICLR 2022 • Bogdan Mazoure, Ahmed M. Ahmed, Patrick MacAlpine, R Devon Hjelm, Andrey Kolobov

A highly desirable property of a reinforcement learning (RL) agent -- and a major difficulty for deep RL approaches -- is the ability to generalize policies learned on a few tasks over a high-dimensional observation space to similar tasks not seen during training.

Reinforcement Learning (RL) Representation Learning +1

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

no code implementations • 29 Nov 2021 • Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.

Contrastive Learning Decision Making +5

The Sandbox Environment for Generalizable Agent Research (SEGAR)

1 code implementation • 19 Mar 2022 • R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov

A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly landmark progress.

Decision Making

Sequential Density Estimation via Nonlinear Continuous Weighted Finite Automata

no code implementations • 8 Jun 2022 • Tianyu Li, Bogdan Mazoure, Guillaume Rabusseau

Although WFAs have been extended to deal with continuous input data, namely continuous WFAs (CWFAs), it is still unclear how to approximate density functions over sequences of continuous random variables using WFA-based models, due to the limitation on the expressiveness of the model as well as the tractability of approximating density functions via CWFAs.

Density Estimation

Contrastive Value Learning: Implicit Models for Simple Offline RL

no code implementations • 3 Nov 2022 • Bogdan Mazoure, Benjamin Eysenbach, Ofir Nachum, Jonathan Tompson

In this paper, we propose Contrastive Value Learning (CVL), which learns an implicit, multi-step model of the environment dynamics.

Continuous Control Model-based Reinforcement Learning +2

Accelerating exploration and representation learning with offline pre-training

no code implementations • 31 Mar 2023 • Bogdan Mazoure, Jake Bruce, Doina Precup, Rob Fergus, Ankit Anand

In this work, we follow the hypothesis that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.

Decision Making NetHack +2

Value function estimation using conditional diffusion models for control

no code implementations • 9 Jun 2023 • Bogdan Mazoure, Walter Talbott, Miguel Angel Bautista, Devon Hjelm, Alexander Toshev, Josh Susskind

A fairly reliable trend in deep reinforcement learning is that performance scales with the number of parameters, provided a complementary scaling in the amount of training data.

Continuous Control

Large Language Models as Generalizable Policies for Embodied Tasks

no code implementations • 26 Oct 2023 • Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev

We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks.

Language Modelling Large Language Model +1
