Search Results for author: Omer Gottesman

Found 18 papers, 6 papers with code

Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy

1 code implementation • 10 Jul 2024 • Cameron Allen, Aaron Kirtland, Ruo Yu Tao, Sam Lobel, Daniel Scott, Nicholas Petrocelli, Omer Gottesman, Ronald Parr, Michael L. Littman, George Konidaris

Our metric, the $\lambda$-discrepancy, is the difference between two distinct temporal difference (TD) value estimates, each computed using TD($\lambda$) with a different value of $\lambda$.
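The snippet above states the computation directly, so a small tabular sketch is easy to write down; everything here (the observation-indexed value table, hyperparameters, and the choice of λ = 0 vs. λ = 1) is an illustrative assumption rather than the paper's implementation:

```python
import numpy as np

def td_lambda_values(episodes, n_obs, lam, gamma=0.99, alpha=0.05, sweeps=200):
    """Tabular TD(lambda) value estimates over observation indices.

    episodes: list of episodes, each a list of (obs, reward, next_obs, done).
    """
    V = np.zeros(n_obs)
    for _ in range(sweeps):
        for episode in episodes:
            z = np.zeros(n_obs)                      # eligibility traces
            for obs, r, next_obs, done in episode:
                target = r + (0.0 if done else gamma * V[next_obs])
                delta = target - V[obs]
                z *= gamma * lam
                z[obs] += 1.0
                V += alpha * delta * z
    return V

def lambda_discrepancy(episodes, n_obs):
    # Difference between two TD(lambda) estimates computed from the same data.
    v_td = td_lambda_values(episodes, n_obs, lam=0.0)   # one-step TD
    v_mc = td_lambda_values(episodes, n_obs, lam=1.0)   # Monte Carlo-like
    return np.abs(v_td - v_mc)
```

Under a Markov observation process the two estimates agree in expectation, so a persistently large discrepancy is a signal of partial observability.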

Decision-Focused Model-based Reinforcement Learning for Reward Transfer

no code implementations • 6 Apr 2023 • Abhishek Sharma, Sonali Parbhoo, Omer Gottesman, Finale Doshi-Velez

We also provide theoretical and empirical evidence, on a variety of simulators and real patient data, that RDF can learn simple yet effective models that can be used to plan personalized policies.

Decision Making • Model-based Reinforcement Learning • +2

On the Geometry of Reinforcement Learning in Continuous State and Action Spaces

no code implementations • 29 Dec 2022 • Saket Tiwari, Omer Gottesman, George Konidaris

Central to our work is the idea that the transition dynamics induce a low dimensional manifold of reachable states embedded in the high-dimensional nominal state space.

reinforcement-learning • Reinforcement Learning (RL)
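As a rough, purely illustrative companion to the manifold idea above (not the paper's construction), one can collect states visited by a random policy in a small continuous-control task and check how concentrated their variance is; the environment, rollout counts, and the use of linear PCA are all assumptions:

```python
import numpy as np
import gymnasium as gym                 # assumption: gymnasium is installed
from sklearn.decomposition import PCA

# Collect states visited by a random policy.
env = gym.make("Pendulum-v1")
states = []
for _ in range(50):                     # 50 short rollouts (arbitrary choice)
    obs, _ = env.reset()
    for _ in range(200):
        states.append(obs)
        obs, _, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            break

# A crude linear proxy for "how low-dimensional are the reachable states?":
# the fraction of variance explained by each principal component.
pca = PCA().fit(np.array(states))
print(pca.explained_variance_ratio_)
```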

A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes

no code implementations • 30 Jul 2022 • Kelly W. Zhang, Omer Gottesman, Finale Doshi-Velez

In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Process (MDP) environments.

Decision Making • reinforcement-learning • +3

Faster Deep Reinforcement Learning with Slower Online Network

1 code implementation • 10 Dec 2021 • Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L. Littman, Alexander J. Smola

In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.

Deep Reinforcement Learning • reinforcement-learning • +1
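One plausible way to implement "keeping the online network in the proximity of the target network" is to add a proximal penalty to the standard TD loss; this is a hedged sketch, with the penalty coefficient and loss details chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def proximal_dqn_loss(online_net, target_net, batch, gamma=0.99, prox_coef=1e-2):
    """Standard DQN TD loss plus a proximal term pulling the online
    parameters toward the (slower-moving) target parameters.

    batch = (obs, actions, rewards, next_obs, dones), with actions a LongTensor
    of shape [B]; prox_coef is an assumed hyperparameter, not the paper's value.
    """
    obs, actions, rewards, next_obs, dones = batch

    q = online_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        next_q = target_net(next_obs).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * next_q

    td_loss = F.smooth_l1_loss(q, target)

    # Proximal term: squared distance between online and target parameters.
    prox = sum(
        (p - p_t).pow(2).sum()
        for p, p_t in zip(online_net.parameters(), target_net.parameters())
    )
    return td_loss + prox_coef * prox
```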

Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation

no code implementations • 28 Nov 2021 • Ramtin Keramati, Omer Gottesman, Leo Anthony Celi, Finale Doshi-Velez, Emma Brunskill

Off-policy policy evaluation methods for sequential decision making can be used to help identify if a proposed decision policy is better than a current baseline policy.

Decision Making • Sequential Decision Making

Coarse-Grained Smoothness for RL in Metric Spaces

no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman

We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.

Decision Making
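For intuition only, here is one way a smoothness condition can be coarser than Lipschitz continuity; this additive-slack form is an illustrative assumption, not necessarily the paper's definition:

```latex
% Lipschitz continuity of Q over a metric space (X, d):
%   |Q(x) - Q(y)| <= K d(x, y)            for all x, y in X.
% A coarser condition only constrains Q above a resolution epsilon:
%   |Q(x) - Q(y)| <= K d(x, y) + epsilon  for all x, y in X,
% which every K-Lipschitz function satisfies, but which also admits functions
% with small jumps while still yielding bounds at coarse scales.
\[
  \underbrace{\lvert Q(x)-Q(y)\rvert \le K\,d(x,y)}_{\text{Lipschitz}}
  \qquad\text{vs.}\qquad
  \underbrace{\lvert Q(x)-Q(y)\rvert \le K\,d(x,y) + \varepsilon}_{\text{coarse-grained (illustrative)}}
\]
```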

State Relevance for Off-Policy Evaluation

1 code implementation • 13 Sep 2021 • Simon P. Shen, Yecheng Jason Ma, Omer Gottesman, Finale Doshi-Velez

Importance sampling-based estimators for off-policy evaluation (OPE) are valued for their simplicity, unbiasedness, and reliance on relatively few assumptions.

Off-policy evaluation
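The estimator family mentioned above is standard, so a minimal per-trajectory (weighted) importance sampling sketch is given below; the trajectory format is a hypothetical convenience, not the paper's interface:

```python
import numpy as np

def importance_sampling_ope(trajectories, gamma=0.99, weighted=True):
    """Per-trajectory importance sampling estimate of an evaluation policy's value.

    Each trajectory is a list of (pi_e_prob, pi_b_prob, reward) tuples: the
    evaluation and behaviour policies' probabilities of the logged action,
    and the observed reward.
    """
    returns, weights = [], []
    for traj in trajectories:
        rho = 1.0   # cumulative importance ratio
        g = 0.0     # discounted return
        for t, (pi_e, pi_b, r) in enumerate(traj):
            rho *= pi_e / pi_b
            g += (gamma ** t) * r
        returns.append(g)
        weights.append(rho)
    returns, weights = np.array(returns), np.array(weights)
    if weighted:    # weighted IS: lower variance, at the cost of a small bias
        return float(np.sum(weights * returns) / np.sum(weights))
    return float(np.mean(weights * returns))
```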

Learning Markov State Abstractions for Deep Reinforcement Learning

1 code implementation • NeurIPS 2021 • Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris

A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov.

continuous-control • Continuous Control • +4

Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions

no code implementations • ICML 2020 • Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Anthony Celi, Emma Brunskill, Finale Doshi-Velez

Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education, but safe deployment in high-stakes settings requires ways of assessing its validity.

Off-policy evaluation • reinforcement-learning • +1

A general method for regularizing tensor decomposition methods via pseudo-data

no code implementations • 24 May 2019 • Omer Gottesman, Weiwei Pan, Finale Doshi-Velez

Tensor decomposition methods allow us to learn the parameters of latent variable models through decomposition of low-order moments of data.

Tensor Decomposition • Transfer Learning
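One way to read "regularizing via pseudo-data" in the moment-based setting described above is to mix pseudo-samples into the moment tensors before decomposing them; this is a sketch under that assumption, not the paper's algorithm:

```python
import numpy as np

def regularized_moments(X, X_pseudo, weight=0.1):
    """Second- and third-order moment tensors computed from real data
    augmented with pseudo-data, which pulls the empirical moments toward
    a chosen prior.

    X:        (n, d) observed data.
    X_pseudo: (m, d) pseudo-data, e.g. samples from a prior model.
    weight:   relative weight of the pseudo-data (assumed hyperparameter).
    """
    def moments(Z):
        m2 = np.einsum("ni,nj->ij", Z, Z) / len(Z)
        m3 = np.einsum("ni,nj,nk->ijk", Z, Z, Z) / len(Z)
        return m2, m3

    m2_real, m3_real = moments(X)
    m2_pseudo, m3_pseudo = moments(X_pseudo)
    m2 = (1 - weight) * m2_real + weight * m2_pseudo
    m3 = (1 - weight) * m3_real + weight * m3_pseudo
    return m2, m3   # decompose these with any tensor decomposition routine
```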

Behaviour Policy Estimation in Off-Policy Policy Evaluation: Calibration Matters

no code implementations • 3 Jul 2018 • Aniruddh Raghu, Omer Gottesman, Yao Liu, Matthieu Komorowski, Aldo Faisal, Finale Doshi-Velez, Emma Brunskill

In this work, we consider the problem of estimating a behaviour policy for use in Off-Policy Policy Evaluation (OPE) when the true behaviour policy is unknown.
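A hedged sketch of the setting described above: fit a model of the behaviour policy from logged state-action pairs and calibrate its probabilities before plugging them into importance weights; the classifier and calibration method are illustrative choices, not the paper's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

def fit_behaviour_policy(states, actions):
    """Estimate pi_b(a | s) from logged data; calibration matters because
    miscalibrated probabilities distort importance-sampling weights.
    """
    base = LogisticRegression(max_iter=1000)
    model = CalibratedClassifierCV(base, method="isotonic", cv=5)
    model.fit(states, actions)
    return model

# Usage: probabilities of the logged actions, e.g. for IS weights
# (assumes actions are integer-coded 0..K-1).
# probs = fit_behaviour_policy(S, A).predict_proba(S)
# pi_b_of_logged = probs[np.arange(len(A)), A]
```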

Weighted Tensor Decomposition for Learning Latent Variables with Partial Data

no code implementations • 18 Oct 2017 • Omer Gottesman, Weiwei Pan, Finale Doshi-Velez

Tensor decomposition methods are popular tools for learning latent variables given only lower-order moments of the data.

Tensor Decomposition
