Search Results for author: Daniel J Mankowitz

Found 2 papers, 1 paper with code

Active Offline Policy Selection

1 code implementation • NeurIPS 2021 • Ksenia Konyushkova, Yutian Chen, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas

We use multiple benchmarks, including real-world robotics, with a large number of candidate policies to show that the proposed approach improves upon state-of-the-art off-policy evaluation (OPE) estimates and pure online policy evaluation.

Bayesian Optimization, Off-policy evaluation

Discovering a set of policies for the worst case reward

no code implementations • ICLR 2021 • Tom Zahavy, Andre Barreto, Daniel J Mankowitz, Shaobo Hou, Brendan O'Donoghue, Iurii Kemaev, Satinder Singh

Our main contribution is a policy iteration algorithm that builds a set of policies in order to maximize the worst-case performance of the resulting set-max policy (SMP) on the set of tasks.
