Search Results for author: Chris Harris

Found 6 papers, 1 paper with code

Understanding and Leveraging Overparameterization in Recursive Value Estimation

no code implementations ICLR 2022 Chenjun Xiao, Bo Dai, Jincheng Mei, Oscar A Ramirez, Ramki Gummadi, Chris Harris, Dale Schuurmans

To better understand the utility of deep models in RL, we present an analysis of recursive value estimation using overparameterized linear representations that provides useful, transferable findings (a toy illustration of linear TD in this overparameterized setting is sketched after the tags below).

Reinforcement Learning (RL) Value prediction
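
As a rough illustration of the setting only, and not of the paper's analysis, the sketch below runs TD(0) value estimation with a random, overparameterized linear feature map (more features than states) on a small cyclic MDP; the chain, feature dimension, step size, and iteration count are arbitrary choices for illustration.

import numpy as np

# Illustrative setting only: TD(0) value estimation with an
# overparameterized linear representation (features >> states).
# The cyclic MDP, random features, and hyperparameters are arbitrary.
rng = np.random.default_rng(0)

n_states, n_features = 5, 50                   # overparameterized: 50 features for 5 states
phi = rng.normal(size=(n_states, n_features))  # fixed random feature map
rewards = np.linspace(0.0, 1.0, n_states)
gamma = 0.9

theta = np.zeros(n_features)                   # linear value weights
state = 0
for step in range(20000):
    next_state = (state + 1) % n_states        # deterministic cyclic policy
    td_error = rewards[state] + gamma * phi[next_state] @ theta - phi[state] @ theta
    theta += 0.01 * td_error * phi[state]      # semi-gradient TD(0) update
    state = next_state

print("estimated values:", phi @ theta)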

Joint Shapley values: a measure of joint feature importance

1 code implementation ICLR 2022 Chris Harris, Richard Pymar, Colin Rowat

The Shapley value is one of the most widely used measures of feature importance, in part because it captures a feature's average effect on a model's prediction (the classical per-feature definition is sketched after the tags below).

Feature Importance
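
For reference, the classical per-feature Shapley value that this paper generalizes attributes to each feature the coalition-weighted average of its marginal contributions. The sketch below computes exact Shapley values for a toy value function by enumerating coalitions; it illustrates the standard definition, not the paper's joint extension, and the toy value function is an arbitrary example.

from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating all coalitions.

    value_fn maps a frozenset of features to a real number
    (e.g. the model's expected prediction given those features).
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi

# Toy value function: an additive model with one interaction term.
def toy_value(coalition):
    v = 0.0
    if "a" in coalition:
        v += 1.0
    if "b" in coalition:
        v += 2.0
    if "a" in coalition and "b" in coalition:
        v += 0.5
    return v

print(shapley_values(["a", "b"], toy_value))
# Efficiency check: the attributions sum to toy_value({"a","b"}) - toy_value(set()).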

A maximum-entropy approach to off-policy evaluation in average-reward MDPs

no code implementations NeurIPS 2020 Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Gorur, Chris Harris, Dale Schuurmans

This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs).

Off-policy evaluation
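
For context, the quantity being estimated in this undiscounted, infinite-horizon setting is the policy's long-run average reward. The definition below is the standard average-reward criterion and is not specific to the maximum-entropy estimator proposed in the paper:

\rho^{\pi} \;=\; \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_{\pi}\!\left[ \sum_{t=1}^{T} r_t \right]

Off-policy evaluation then asks for an estimate of \rho^{\pi} using only trajectories collected under a different behavior policy.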

RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real

no code implementations CVPR 2020 Kanishka Rao, Chris Harris, Alex Irpan, Sergey Levine, Julian Ibarz, Mohi Khansari

However, this sort of translation is typically task-agnostic, in that the translated images may not preserve all features that are relevant to the task.

Reinforcement Learning (RL) +2

Surrogate Objectives for Batch Policy Optimization in One-step Decision Making

no code implementations NeurIPS 2019 Minmin Chen, Ramki Gummadi, Chris Harris, Dale Schuurmans

We investigate batch policy optimization for cost-sensitive classification and contextual bandits: two related tasks that obviate exploration but require generalizing from observed rewards to action selections in unseen contexts (a minimal logged-bandit objective is sketched after the tags below).

Decision Making Multi-Armed Bandits
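
A standard objective in this batch (logged) contextual-bandit setting is the inverse-propensity-scored (IPS) estimate of a candidate policy's value. The sketch below illustrates the problem setting only; it is not the surrogate objectives proposed in the paper, and the data shapes, uniform logging policy, and softmax parameterization are arbitrary assumptions.

import numpy as np

# Batch policy optimization from logged contextual-bandit data:
# evaluate a softmax policy via the standard IPS estimator.
rng = np.random.default_rng(0)

n, d, k = 1000, 5, 3                       # logged rounds, context dim, actions
contexts = rng.normal(size=(n, d))
logged_actions = rng.integers(0, k, size=n)
logged_propensities = np.full(n, 1.0 / k)  # assume the behavior policy was uniform
rewards = rng.binomial(1, 0.3, size=n).astype(float)

def softmax_policy(weights, contexts):
    logits = contexts @ weights            # (n, k)
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def ips_value(weights):
    """Importance-weighted estimate of the target policy's expected reward."""
    probs = softmax_policy(weights, contexts)
    pi_a = probs[np.arange(n), logged_actions]
    return np.mean(rewards * pi_a / logged_propensities)

w = rng.normal(size=(d, k)) * 0.01
print("IPS value estimate:", ips_value(w))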

Off-Policy Evaluation via Off-Policy Classification

no code implementations NeurIPS 2019 Alex Irpan, Kanishka Rao, Konstantinos Bousmalis, Chris Harris, Julian Ibarz, Sergey Levine

However, for high-dimensional observations, such as images, models of the environment can be difficult to fit, and value-based methods can make importance sampling (IS) hard to use or even ill-conditioned, especially when dealing with continuous action spaces (the standard trajectory-level IS estimator is sketched after the tags below).

Classification General Classification +2
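
For context, the sketch below shows the standard trajectory-level importance-sampling estimator whose difficulties the snippet alludes to: the cumulative likelihood ratio grows multiplicatively with horizon and policy mismatch. It is not the off-policy classification approach the paper proposes, and the synthetic data is an arbitrary example.

import numpy as np

def is_estimate(trajectories, gamma=0.99):
    """Trajectory-level IS estimate of a target policy's expected return.

    trajectories: list of lists of (pi_prob, mu_prob, reward) tuples, where
    pi_prob / mu_prob are the target / behavior action probabilities.
    """
    returns = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (pi_p, mu_p, r) in enumerate(traj):
            weight *= pi_p / mu_p          # cumulative likelihood ratio
            ret += (gamma ** t) * r
        returns.append(weight * ret)
    return float(np.mean(returns))

# Tiny synthetic example with slightly mismatched policies:
# even a 0.6 / 0.5 ratio compounds to ~38x over 20 steps.
rng = np.random.default_rng(0)
trajs = [[(0.6, 0.5, rng.random()) for _ in range(20)] for _ in range(100)]
print("IS estimate:", is_estimate(trajs))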
