1 code implementation • NeurIPS 2023 • Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao
We propose A-Crab (Actor-Critic Regularized by Average Bellman error), a new practical algorithm for offline reinforcement learning (RL) in complex environments with insufficient data coverage.
no code implementations • 1 Nov 2022 • Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao
Offline reinforcement learning (RL), which refers to decision-making from a previously-collected dataset of interactions, has received significant attention over the past years.
1 code implementation • NeurIPS 2021 • Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell
As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.
no code implementations • NeurIPS 2021 • Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart Russell
Based on the composition of the offline dataset, two main categories of methods are used: imitation learning which is suitable for expert datasets and vanilla offline RL which often requires uniform coverage datasets.
1 code implementation • NeurIPS 2020 • Paria Rashidinejad, Jiantao Jiao, Stuart Russell
Our theoretical and experimental results shed light on the conditions required for efficient probably approximately correct (PAC) learning of the Kalman filter from partially observed data.