Search Results for author: Brahma S. Pavse

Found 4 papers, 1 paper with code

State-Action Similarity-Based Representations for Off-Policy Evaluation

1 code implementation • NeurIPS 2023 • Brahma S. Pavse, Josiah P. Hanna

Instead, in this paper, we seek to enhance the data-efficiency of FQE by first transforming the fixed dataset using a learned encoder, and then feeding the transformed dataset into FQE.

Off-policy evaluation • Representation Learning
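
The sketch below illustrates the pipeline the abstract above describes: transform the fixed dataset with a learned encoder, then run standard fitted Q-evaluation (FQE) on the transformed data. The encoder `phi`, the discrete-action setup, and the regressor choice are illustrative assumptions; how the paper actually trains the encoder from state-action similarity is not shown here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor  # any regressor would do

def fqe_with_encoder(dataset, phi, pi_e, n_actions, gamma=0.99, n_iters=50):
    """Fitted Q-evaluation (FQE) run on an encoder-transformed dataset.

    dataset: list of (s, a, r, s_next, done) tuples with discrete actions.
    phi:     learned encoder mapping a raw state to a 1-D feature vector.
    pi_e:    evaluation policy; pi_e(s) returns an action index.
    """
    # Transform the fixed dataset once using the learned encoder.
    S      = np.array([phi(s) for (s, _, _, _, _) in dataset])
    A      = np.array([a for (_, a, _, _, _) in dataset])
    R      = np.array([r for (_, _, r, _, _) in dataset])
    S_next = np.array([phi(s2) for (_, _, _, s2, _) in dataset])
    done   = np.array([float(d) for (_, _, _, _, d) in dataset])
    a_next = np.array([pi_e(s2) for (_, _, _, s2, _) in dataset])

    def features(states, actions):
        # Concatenate the encoded state with a one-hot action encoding.
        return np.hstack([states, np.eye(n_actions)[actions]])

    q = None
    for _ in range(n_iters):
        if q is None:
            targets = R  # first iteration: regress on the immediate reward only
        else:
            # Bootstrapped target: r + gamma * Q(phi(s'), pi_e(s')).
            targets = R + gamma * (1.0 - done) * q.predict(features(S_next, a_next))
        q = GradientBoostingRegressor().fit(features(S, A), targets)
    return q  # approximate Q-function of pi_e on encoded states
```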

Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces

no code implementations • 2 Jun 2023 • Brahma S. Pavse, Matthew Zurek, Yudong Chen, Qiaomin Xie, Josiah P. Hanna

This latter objective is called stability and is especially important when the state space is unbounded, such that the states can be arbitrarily far from each other and the agent can drift far away from the desired states.

Attribute • reinforcement-learning +1
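
As a concrete illustration of the unbounded-state-space setting described above (and not of the paper's method), the toy simulation below models a single-server queue whose length can grow without bound: an under-serving policy lets the state drift arbitrarily far from the desired empty-queue state, which is the failure mode a stability objective guards against.

```python
import random

def simulate_queue(serve_prob_fn, arrival_prob=0.45, steps=10_000, seed=0):
    """Simulate queue length under a policy mapping state -> service probability."""
    rng = random.Random(seed)
    q = 0
    lengths = []
    for _ in range(steps):
        if rng.random() < arrival_prob:                 # a job arrives
            q += 1
        if q > 0 and rng.random() < serve_prob_fn(q):   # a job is served
            q -= 1
        lengths.append(q)                               # the state is unbounded above
    return lengths

# A policy that serves often enough keeps the queue near 0 on average (stable);
# one that under-serves lets the state drift further and further away (unstable).
stable   = simulate_queue(lambda q: 0.6)
unstable = simulate_queue(lambda q: 0.3)
print("mean queue length, stable policy:  ", sum(stable) / len(stable))
print("mean queue length, unstable policy:", sum(unstable) / len(unstable))
```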

Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction

no code implementations • 14 Dec 2022 • Brahma S. Pavse, Josiah P. Hanna

We consider the problem of off-policy evaluation (OPE) in reinforcement learning (RL), where the goal is to estimate the performance of an evaluation policy, $\pi_e$, using a fixed dataset, $\mathcal{D}$, collected by one or more policies that may be different from $\pi_e$.

Off-policy evaluation
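
The sketch below shows how a marginalized importance sampling (MIS) estimate of $\pi_e$'s per-step performance can be formed once a state density ratio is available, with the ratio defined over abstracted rather than raw high-dimensional states. The abstraction function `abstract_fn`, the weight table `w`, and the self-normalized form are illustrative assumptions; estimating `w` itself (the core difficulty) and the paper's exact abstraction procedure are not shown.

```python
def mis_estimate(dataset, w, pi_e_prob, pi_b_prob, abstract_fn):
    """Self-normalized MIS estimate of the average per-step reward under pi_e.

    dataset:     list of (s, a, r) samples drawn from the behavior dataset D.
    w:           dict mapping an abstract state phi(s) to the estimated
                 density ratio d^{pi_e}(phi(s)) / d^{D}(phi(s)).
    pi_e_prob:   pi_e(a | s), action probability under the evaluation policy.
    pi_b_prob:   pi_b(a | s), action probability under the behavior policy.
    abstract_fn: maps a high-dimensional state to a small abstract state.
    """
    total, norm = 0.0, 0.0
    for (s, a, r) in dataset:
        phi_s = abstract_fn(s)                   # collapse the high-dimensional state
        rho = pi_e_prob(a, s) / pi_b_prob(a, s)  # per-step action probability ratio
        weight = w.get(phi_s, 0.0) * rho         # marginalized state ratio x action ratio
        total += weight * r
        norm += weight
    return total / max(norm, 1e-8)
```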
