1 code implementation • 10 Mar 2023 • Manan Tomar, Riashat Islam, Sergey Levine, Philip Bachman
Informational parsimony, i.e., using the minimal information required for a task, provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations.
no code implementations • 29 Dec 2022 • Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup
Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications.
no code implementations • 28 Dec 2022 • Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb
Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations.
1 code implementation • 2 Nov 2022 • Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet des Combes, Romain Laroche
Our method, Behavior Prior Representation (BPR), learns state representations with an easy-to-integrate objective based on behavior cloning of the dataset: we first learn a state representation by mimicking actions from the dataset, and then train a policy on top of the fixed representation, using any off-the-shelf Offline RL algorithm.
no code implementations • 1 Nov 2022 • Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes
Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reaching a diverse set of objectives.
1 code implementation • 31 Oct 2022 • Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford
We find that contemporary representation learning techniques can fail on datasets where the noise is a complex, time-dependent process, which is prevalent in practical applications.
no code implementations • 17 Jul 2022 • Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford
In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information.
no code implementations • 31 Dec 2021 • Samin Yeasar Arnob, Riashat Islam, Doina Precup
We hypothesize that empirically studying the sample complexity of offline reinforcement learning (RL) is crucial for the practical applications of RL in the real world.
no code implementations • 1 Jan 2021 • Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Zhaoran Wang, Animesh Garg, Lihong Li, Doina Precup
Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications.
no code implementations • 11 Dec 2019 • Riashat Islam, Zafarali Ahmed, Doina Precup
Entropy regularization is used to improve optimization performance in reinforcement learning tasks.
no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Samin Yeasar Arnob, Doina Precup
Furthermore, in cases where the reward function is stochastic, which can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.
no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup
In this work, we propose exploration in policy gradient methods based on maximizing entropy of the discounted future state distribution.
no code implementations • 16 Nov 2019 • Riashat Islam, Komal K. Teru, Deepak Sharma, Joelle Pineau
This data distribution shift between current and past samples can significantly impact the performance of most modern off-policy based policy optimization algorithms.
no code implementations • 9 Jun 2019 • Disha Shrivastava, Eeshan Gunesh Dhekane, Riashat Islam
Exploration and adaptation to new tasks in a transfer learning setup is a central challenge in reinforcement learning.
no code implementations • ICLR 2019 • Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine
In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
no code implementations • 30 Jan 2019 • Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine
In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
3 code implementations • 30 Nov 2018 • Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning.
no code implementations • 27 Nov 2018 • Arash Tavakoli, Vitaly Levdik, Riashat Islam, Christopher M. Smith, Petar Kormushev
We consider the generic approach of using an experience memory to help exploration by adapting a restart distribution.
1 code implementation • 11 Jul 2018 • Philip Bachman, Riashat Islam, Alessandro Sordoni, Zafarali Ahmed
We introduce a deep generative model for functions.
1 code implementation • 6 Dec 2017 • Peter Henderson, Thang Doan, Riashat Islam, David Meger
Policy gradient methods have had great success in solving continuous control tasks, yet the stochastic nature of such problems makes deterministic value estimation difficult.
no code implementations • 12 Nov 2017 • Bogdan Mazoure, Riashat Islam
We investigate the use of alternative divergences to Kullback-Leibler (KL) in variational inference (VI), based on Variational Dropout (Kingma et al., 2015).
no code implementations • ICLR 2018 • David Krueger, Chin-wei Huang, Riashat Islam, Ryan Turner, Alexandre Lacoste, Aaron Courville
We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks.
5 code implementations • 19 Sep 2017 • Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL).
1 code implementation • 10 Aug 2017 • Riashat Islam, Peter Henderson, Maziar Gomrokchi, Doina Precup
We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results.
4 code implementations • ICML 2017 • Yarin Gal, Riashat Islam, Zoubin Ghahramani
In this paper we combine recent advances in Bayesian deep learning into the active learning framework in a practical way.