no code implementations • 29 Dec 2022 • Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Animesh Garg, Zhaoran Wang, Lihong Li, Doina Precup
Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications.
no code implementations • 28 Dec 2022 • Riashat Islam, Hongyu Zang, Manan Tomar, Aniket Didolkar, Md Mofijul Islam, Samin Yeasar Arnob, Tariq Iqbal, Xin Li, Anirudh Goyal, Nicolas Heess, Alex Lamb
Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations.
no code implementations • 31 Dec 2021 • Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Doina Precup
We leverage a fixed dataset to prune neural networks before the start of RL training.
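Pruning a network before RL training typically means zeroing out low-importance weights according to some saliency criterion. As a minimal sketch only, the snippet below shows generic one-shot magnitude pruning; the paper's actual criterion, which leverages a fixed offline dataset, is not reproduced here, and the function name and interface are hypothetical.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights.

    A generic one-shot pruning step (hypothetical interface); data-driven
    criteria would replace the magnitude score with a dataset-based one.
    Returns the pruned weights and the boolean keep-mask.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy(), np.ones_like(weights, dtype=bool)
    # Threshold at the k-th smallest magnitude; ties at the threshold are pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask
```

The keep-mask would then be held fixed while the sparse network is trained with the RL objective.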
no code implementations • 31 Dec 2021 • Samin Yeasar Arnob, Riashat Islam, Doina Precup
We hypothesize that empirically studying the sample complexity of offline reinforcement learning (RL) is crucial for the practical applications of RL in the real world.
no code implementations • 1 Jan 2021 • Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Zhaoran Wang, Animesh Garg, Lihong Li, Doina Precup
Learning policies from fixed offline datasets is a key challenge to scale up reinforcement learning (RL) algorithms towards practical applications.
1 code implementation • ICML Workshop LifelongML 2020 • Samin Yeasar Arnob
Adversarial Imitation Learning (AIL) is a class of reinforcement learning (RL) algorithms that imitates an expert without receiving any reward from the environment and without providing expert behavior directly to the policy during training.
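In AIL, a discriminator is trained to tell expert states (or transitions) from policy ones, and its output serves as a surrogate reward in place of an environment reward. The sketch below illustrates this idea with a toy logistic-regression discriminator on synthetic 1-D data; the data, model, and GAIL-style reward form are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "expert" and "policy" state samples (hypothetical data).
expert = rng.normal(2.0, 0.5, size=256)
policy = rng.normal(0.0, 0.5, size=256)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Logistic-regression discriminator D(s) = sigmoid(w*s + b):
# probability that state s came from the expert.
w, b = 0.0, 0.0
lr = 0.5
for _ in range(200):
    d_exp = sigmoid(w * expert + b)
    d_pol = sigmoid(w * policy + b)
    # Binary cross-entropy gradient: expert labeled 1, policy labeled 0.
    w -= lr * ((expert * (d_exp - 1)).mean() + (policy * d_pol).mean())
    b -= lr * ((d_exp - 1).mean() + d_pol.mean())

# GAIL-style surrogate reward: higher where states look expert-like;
# the policy would maximize this instead of an environment reward.
reward = -np.log(1.0 - sigmoid(w * policy + b) + 1e-8)
```

In a full AIL loop the discriminator and policy are updated alternately, with the policy trained by any RL algorithm on this surrogate reward.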
no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Samin Yeasar Arnob, Doina Precup
Furthermore, in cases where the reward function is stochastic, which can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.
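The doubly robust idea combines an importance-weighted correction with a learned critic, so the estimate stays accurate if either the importance weights or the critic are good. A minimal sketch of the standard bandit-form estimator is below; the function name and interface are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def doubly_robust_value(rewards, behavior_probs, target_probs, q_logged, q_target):
    """Doubly robust off-policy value estimate (standard bandit form).

    rewards        : observed rewards for the logged actions
    behavior_probs : behavior policy's probability of each logged action
    target_probs   : target policy's probability of each logged action
    q_logged       : critic's value estimate for each logged action
    q_target       : critic's expected value under the target policy

    The critic term q_target gives a baseline; the importance-weighted
    residual (rewards - q_logged) corrects the critic's bias.
    """
    rho = target_probs / behavior_probs  # importance weight
    return np.mean(q_target + rho * (rewards - q_logged))
```

When the critic is exact, the residual term vanishes in expectation and the variance contributed by the importance weights shrinks, which is why a doubly robust critic helps under noisy or corrupted reward signals.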