Search Results for author: Melrose Roderick

Found 7 papers, 4 papers with code

Mean Actor Critic

2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.

Atari Games reinforcement-learning +1

Deep Abstract Q-Networks

no code implementations • 2 Oct 2017 • Melrose Roderick, Christopher Grimm, Stefanie Tellex

We examine the problem of learning and planning on high-dimensional domains with long horizons and sparse rewards.

Montezuma's Revenge

Implementing the Deep Q-Network

1 code implementation • 20 Nov 2017 • Melrose Roderick, James Macglashan, Stefanie Tellex

The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and building point for much deep reinforcement learning research.

Atari Games

Provably Safe PAC-MDP Exploration Using Analogies

1 code implementation • 7 Jul 2020 • Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter

A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).

reinforcement-learning Reinforcement Learning (RL) +1

Enforcing robust control guarantees within neural network policies

1 code implementation • ICLR 2021 • Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter

When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.

Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning

no code implementations • 25 Nov 2023 • Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter

A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.

Q-Learning Reinforcement Learning (RL)

Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation

no code implementations • 29 Dec 2023 • Melrose Roderick, Felix Berkenkamp, Fatemeh Sheikholeslami, Zico Kolter

In many real-world problems, there is a limited set of training data, but an abundance of unlabeled data.
