Search Results for author: Raihan Seraj

Found 6 papers, 2 papers with code

PcLast: Discovering Plannable Continuous Latent States

no code implementations • 6 Nov 2023 • Anurag Koul, Shivakanth Sujit, Shaoru Chen, Ben Evans, Lili Wu, Byron Xu, Rajan Chari, Riashat Islam, Raihan Seraj, Yonathan Efroni, Lekan Molu, Miro Dudik, John Langford, Alex Lamb

Goal-conditioned planning benefits from learned low-dimensional representations of rich, high-dimensional observations.

Tsetlin Machine for Solving Contextual Bandit Problems

1 code implementation • 4 Feb 2022 • Raihan Seraj, Jivitesh Sharma, Ole-Christoffer Granmo

This paper introduces an interpretable contextual bandit algorithm using Tsetlin Machines, which solve complex pattern recognition tasks using propositional logic.

Thompson Sampling

Approximate information state for approximate planning and reinforcement learning in partially observed systems

1 code implementation • 17 Oct 2020 • Jayakumar Subramanian, Amit Sinha, Raihan Seraj, Aditya Mahajan

Our key result is to show that if a function of the history (called approximate information state (AIS)) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program.

Reinforcement Learning (RL)

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning

no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Samin Yeasar Arnob, Doina Precup

Furthermore, when the reward function is stochastic, which can lead to high variance, doubly robust critic estimation can improve performance under corrupted, stochastic reward signals, indicating its usefulness for robust and safe reinforcement learning.

Continuous Control, Reinforcement Learning

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods

no code implementations • 11 Dec 2019 • Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup

In this work, we propose exploration in policy gradient methods based on maximizing entropy of the discounted future state distribution.

Policy Gradient Methods
