Search Results for author: Andrew Bennett

Found 18 papers, 9 papers with code

Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes

no code implementations 29 Mar 2024 Andrew Bennett, Nathan Kallus, Miruna Oprescu, Wen Sun, Kaiwen Wang

We characterize the sharp bounds on policy value under this model, that is, the tightest possible bounds given by the transition observations from the original MDP, and we study the estimation of these bounds from such transition observations.

Off-policy evaluation

Low-Rank MDPs with Continuous Action Spaces

no code implementations 6 Nov 2023 Andrew Bennett, Nathan Kallus, Miruna Oprescu

Low-Rank Markov Decision Processes (MDPs) have recently emerged as a promising framework within the domain of reinforcement learning (RL), as they allow for provably approximately correct (PAC) learning guarantees while also incorporating ML algorithms for representation learning.

PAC learning Reinforcement Learning (RL) +1

Source Condition Double Robust Inference on Functionals of Inverse Problems

no code implementations 25 Jul 2023 Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara

We consider estimation of parameters defined as linear functionals of solutions to linear inverse problems.

Provable Safe Reinforcement Learning with Binary Feedback

1 code implementation 26 Oct 2022 Andrew Bennett, Dipendra Misra, Nathan Kallus

Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values; that is, whether an action in a given state is safe or unsafe.

Active Learning reinforcement-learning +2

Inference on Strongly Identified Functionals of Weakly Identified Functions

no code implementations 17 Aug 2022 Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara

In a variety of applications, including nonparametric instrumental variable (NPIV) analysis, proximal causal inference under unmeasured confounding, and missing-not-at-random data with shadow variables, we are interested in inference on a continuous linear functional (e.g., average causal effects) of a nuisance function (e.g., NPIV regression) defined by conditional moment restrictions.

Causal Inference regression +1

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

1 code implementation NeurIPS 2023 Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun

Finally, we extend our methods to learning of dynamics and establish the connection between our approach and the well-known spectral learning methods in POMDPs.

Off-policy evaluation

Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes

1 code implementation 28 Oct 2021 Andrew Bennett, Nathan Kallus

To answer these, we extend the framework of proximal causal inference to our POMDP setting, providing a variety of settings where identification is made possible by the existence of so-called bridge functions.

Causal Inference Management +2

The Variational Method of Moments

2 code implementations 17 Dec 2020 Andrew Bennett, Nathan Kallus

The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression.


Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders

no code implementations 27 Jul 2020 Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi

We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where states and actions can act as proxies for the unobserved confounders.

Off-policy evaluation reinforcement-learning

Efficient Policy Learning from Surrogate-Loss Classification Reductions

1 code implementation ICML 2020 Andrew Bennett, Nathan Kallus

We show that, under a correct specification assumption, the weighted classification formulation need not be efficient for policy parameters.

Binary Classification Classification +1

Policy Evaluation with Latent Confounders via Optimal Balance

1 code implementation NeurIPS 2019 Andrew Bennett, Nathan Kallus

We study the question of policy evaluation when we instead have proxies for the latent confounders and develop an importance weighting method that avoids fitting a latent outcome regression model.


Deep Generalized Method of Moments for Instrumental Variable Analysis

2 code implementations NeurIPS 2019 Andrew Bennett, Nathan Kallus, Tobias Schnabel

Instrumental variable analysis is a powerful tool for estimating causal effects when randomization or full control of confounders is not possible.

Model Selection

Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning

1 code implementation 31 May 2018 Valts Blukis, Nataly Brukhim, Andrew Bennett, Ross A. Knepper, Yoav Artzi

We introduce a method for following high-level navigation instructions by mapping directly from images, instructions and pose estimates to continuous low-level velocity commands for real-time control.

Imitation Learning Instruction Following
