Search Results for author: Melrose Roderick

Found 7 papers, 4 papers with code

Mean Actor Critic

2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman

We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.

Ranked #1 on Continuous Control on Cart Pole (OpenAI Gym)

Atari Games, reinforcement-learning

Deep Abstract Q-Networks

no code implementations • 2 Oct 2017 • Melrose Roderick, Christopher Grimm, Stefanie Tellex

We examine the problem of learning and planning on high-dimensional domains with long horizons and sparse rewards.

Montezuma's Revenge

Implementing the Deep Q-Network

1 code implementation • 20 Nov 2017 • Melrose Roderick, James MacGlashan, Stefanie Tellex

The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and building point for much deep reinforcement learning research.

Atari Games

Provably Safe PAC-MDP Exploration Using Analogies

1 code implementation • 7 Jul 2020 • Melrose Roderick, Vaishnavh Nagarajan, J. Zico Kolter

A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure).

reinforcement-learning, Reinforcement Learning (RL)

Enforcing robust control guarantees within neural network policies

1 code implementation • ICLR 2021 • Priya L. Donti, Melrose Roderick, Mahyar Fazlyab, J. Zico Kolter

When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance.

Projected Off-Policy Q-Learning (POP-QL) for Stabilizing Offline Reinforcement Learning

no code implementations • 25 Nov 2023 • Melrose Roderick, Gaurav Manek, Felix Berkenkamp, J. Zico Kolter

A key problem in off-policy Reinforcement Learning (RL) is the mismatch, or distribution shift, between the dataset and the distribution over states and actions visited by the learned policy.

Q-Learning, Reinforcement Learning (RL)

Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation

no code implementations • 29 Dec 2023 • Melrose Roderick, Felix Berkenkamp, Fatemeh Sheikholeslami, Zico Kolter

In many real-world problems, there is a limited set of training data, but an abundance of unlabeled data.
