Search Results for author: Esther Derman

Found 9 papers, 3 papers with code

Tree Search-Based Policy Optimization under Stochastic Execution Delay

1 code implementation • 8 Apr 2024 • David Valensi, Esther Derman, Shie Mannor, Gal Dalal

We show that given observed delay values, it is sufficient to perform a policy search in the class of Markov policies in order to reach optimal performance, thus extending the deterministic fixed delay case.
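
A rough sketch of the claim (illustrative notation, not taken from the paper): with observed stochastic delays $z_t$, the action executed at time $t$ is the one committed at time $t - z_t$, and searching over Markov policies already attains the best performance achievable with history-dependent policies:

```latex
% Illustrative notation only: a^{exec}_t = a_{t - z_t} with observed delay z_t.
\sup_{\pi \in \Pi_{\mathrm{Markov}}} J(\pi)
\;=\;
\sup_{\pi \in \Pi_{\mathrm{history}}} J(\pi)
```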

Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization

no code implementations • 3 Sep 2023 • Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor

In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set.
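
For context, a minimal sketch of the robust objective this sentence refers to (standard RMDP notation, assumed rather than quoted from the paper; the uncertainty set $\mathcal{U}$ over rewards and transitions need not be rectangular):

```latex
\max_{\pi} \; \min_{(r, P) \in \mathcal{U}} \;
  \mathbb{E}^{\pi}_{P}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
```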

Twice Regularized Markov Decision Processes: The Equivalence between Robustness and Regularization

1 code implementation • 12 Mar 2023 • Esther Derman, Yevgeniy Men, Matthieu Geist, Shie Mannor

We then generalize regularized MDPs to twice regularized MDPs ($\text{R}^2$ MDPs), i.e., MDPs with $\textit{both}$ value and policy regularization.
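
A schematic of what "both value and policy regularization" looks like in a Bellman evaluation operator, assuming generic convex regularizers $\Omega_{\pi}$ and $\Omega_{v}$ (the paper derives specific regularizers from the uncertainty sets; this is only an illustrative form, not the paper's operator):

```latex
[T^{\pi}_{\mathrm{R}^2} v](s) \;=\;
  \sum_{a} \pi(a \mid s)\Big( r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, v(s') \Big)
  \;-\; \Omega_{\pi}\big(\pi(\cdot \mid s)\big) \;-\; \Omega_{v}(v)
```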

Twice regularized MDPs and the equivalence between robustness and regularization

no code implementations • NeurIPS 2021 • Esther Derman, Matthieu Geist, Shie Mannor

We finally generalize regularized MDPs to twice regularized MDPs ($\text{R}^2$ MDPs), i.e., MDPs with $\textit{both}$ value and policy regularization.

Acting in Delayed Environments with Non-Stationary Markov Policies

2 code implementations • ICLR 2021 • Esther Derman, Gal Dalal, Shie Mannor

We introduce a framework for learning and planning in MDPs where the decision-maker commits actions that are executed with a delay of $m$ steps.

Tasks: Cloud Computing, Q-Learning
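
To make the execution-delay setup of this paper concrete, here is a minimal gym-style wrapper sketch in which every committed action is applied $m$ steps later. This is only an illustration of the problem setting, not the authors' implementation; `env` and `noop_action` are assumed placeholders.

```python
from collections import deque


class ConstantDelayWrapper:
    """Sketch of an m-step execution delay around a gym-style environment.

    Illustrative only: `env` is assumed to expose reset() and step(action),
    and `noop_action` is a placeholder action used to fill the queue before
    any committed action has reached the environment.
    """

    def __init__(self, env, delay_m, noop_action):
        self.env = env
        self.delay_m = delay_m
        self.noop_action = noop_action
        self.pending = deque()

    def reset(self):
        # The first m environment steps execute no-ops, since no committed
        # action has "arrived" yet.
        self.pending = deque([self.noop_action] * self.delay_m)
        return self.env.reset()

    def step(self, action):
        # The action committed now is queued; the action executed now is the
        # one committed m steps ago.
        self.pending.append(action)
        executed = self.pending.popleft()
        return self.env.step(executed)
```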

Distributional Robustness and Regularization in Reinforcement Learning

no code implementations • 5 Mar 2020 • Esther Derman, Shie Mannor

Distributionally Robust Optimization (DRO) has made it possible to prove the equivalence between robustness and regularization in classification and regression, thus providing an analytical reason why regularization generalizes well in statistical learning.

Tasks: Decision Making, reinforcement-learning +1
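
One classical instance of this robustness/regularization equivalence, stated here from the robust-regression literature rather than from this paper's abstract: least squares that is robust to feature-wise perturbations coincides with Lasso-style regularization.

```latex
\min_{\beta}\; \max_{\|\delta_j\|_2 \le c_j,\; j=1,\dots,d}\;
  \bigl\| y - (X + [\delta_1, \dots, \delta_d])\,\beta \bigr\|_2
\;=\;
\min_{\beta}\; \| y - X\beta \|_2 \;+\; \sum_{j=1}^{d} c_j\, |\beta_j|
```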

A Bayesian Approach to Robust Reinforcement Learning

no code implementations • 20 May 2019 • Esther Derman, Daniel Mankowitz, Timothy Mann, Shie Mannor

Robust Markov Decision Processes (RMDPs) are intended to ensure robustness with respect to changing or adversarial system behavior.

Tasks: reinforcement-learning, Reinforcement Learning (RL) +1

Soft-Robust Actor-Critic Policy-Gradient

no code implementations • 11 Mar 2018 • Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

The proposed method learns an optimal policy with respect to a distribution over an uncertainty set, so it stays robust to model uncertainty while avoiding the conservativeness of robust strategies.

Tasks: reinforcement-learning, Reinforcement Learning (RL)
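
Schematically (illustrative notation, assuming a fixed distribution $\omega$ over models in the uncertainty set $\mathcal{P}$), the soft-robust criterion averages over models, whereas the robust criterion takes the worst case:

```latex
J_{\mathrm{soft}}(\pi) \;=\; \mathbb{E}_{P \sim \omega}\,
  \mathbb{E}^{\pi}_{P}\!\left[\sum_{t=0}^{\infty}\gamma^{t} r(s_t, a_t)\right]
\qquad\text{vs.}\qquad
J_{\mathrm{robust}}(\pi) \;=\; \min_{P \in \mathcal{P}}\;
  \mathbb{E}^{\pi}_{P}\!\left[\sum_{t=0}^{\infty}\gamma^{t} r(s_t, a_t)\right]
```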
