Search Results for author: Paria Rashidinejad

Found 5 papers, 3 papers with code

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning

1 code implementation · NeurIPS 2023 · Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao

We propose A-Crab (Actor-Critic Regularized by Average Bellman error), a new practical algorithm for offline reinforcement learning (RL) in complex environments with insufficient data coverage.

Reinforcement Learning (RL)
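As a rough illustration of the idea named in the title, the minimal sketch below adds a penalty on the average Bellman residual over an offline batch to an otherwise standard critic loss. It is not the authors' A-Crab implementation; the network sizes, discount, and penalty weight `beta` are placeholder assumptions.

```python
# Illustrative sketch only -- not the A-Crab algorithm from the paper.
# Shows a critic loss with an added penalty on the *average* Bellman
# residual over an offline batch; sizes, gamma, and `beta` are assumptions.
import torch
import torch.nn as nn

gamma, beta = 0.99, 1.0  # assumed discount and penalty weight

critic = nn.Sequential(nn.Linear(4 + 1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(critic.parameters(), lr=3e-4)

def critic_loss(batch):
    """batch: dict of offline tensors s, a, r, s_next, a_next."""
    q = critic(torch.cat([batch["s"], batch["a"]], dim=-1)).squeeze(-1)
    with torch.no_grad():
        q_next = critic(torch.cat([batch["s_next"], batch["a_next"]], dim=-1)).squeeze(-1)
        target = batch["r"] + gamma * q_next
    residual = q - target                 # per-sample Bellman error
    td_loss = residual.pow(2).mean()      # standard squared TD term
    avg_bellman = residual.mean().abs()   # average Bellman error over the batch
    return td_loss + beta * avg_bellman

# one gradient step on a fake offline batch (shapes are assumptions)
batch = {
    "s": torch.randn(256, 4), "a": torch.randn(256, 1), "r": torch.randn(256),
    "s_next": torch.randn(256, 4), "a_next": torch.randn(256, 1),
}
loss = critic_loss(batch)
opt.zero_grad(); loss.backward(); opt.step()
```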

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian

no code implementations · 1 Nov 2022 · Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart Russell, Jiantao Jiao

Offline reinforcement learning (RL), which refers to decision-making from a previously collected dataset of interactions, has received significant attention in recent years.

Decision Making · Offline RL · +2

MADE: Exploration via Maximizing Deviation from Explored Regions

1 code implementation · NeurIPS 2021 · Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

As a proof of concept, we evaluate the new intrinsic reward on tabular examples across a variety of model-based and model-free algorithms, showing improvements over count-only exploration strategies.

Efficient Exploration · Reinforcement Learning (RL)
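For context on the baseline the snippet mentions, here is a minimal sketch of a count-only exploration bonus in a tabular setting; it illustrates what MADE is compared against, not the MADE intrinsic reward itself, and the bonus scale `beta` is an assumption.

```python
# Sketch of the *count-only* exploration bonus used as a baseline above;
# it is not the MADE intrinsic reward. The scale `beta` is an assumption.
from collections import defaultdict
import math

beta = 0.1                       # assumed bonus scale
visit_counts = defaultdict(int)  # N(s, a) for a tabular MDP

def count_bonus(state, action):
    """Intrinsic reward ~ beta / sqrt(N(s, a)); larger for rarely tried pairs."""
    visit_counts[(state, action)] += 1
    return beta / math.sqrt(visit_counts[(state, action)])

# the bonus decays as a state-action pair is revisited
print(count_bonus(0, 1))  # 0.1
print(count_bonus(0, 1))  # ~0.0707
```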

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

no code implementations · NeurIPS 2021 · Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart Russell

Based on the composition of the offline dataset, two main categories of methods are used: imitation learning, which is suited to expert datasets, and vanilla offline RL, which often requires datasets with uniform coverage.

Imitation Learning · Multi-Armed Bandits · +3
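Since the paper's theme is pessimism and its tags include multi-armed bandits, the sketch below shows one common way to instantiate pessimism in an offline bandit: pick the arm with the highest lower confidence bound. The Hoeffding-style confidence width and the toy data are assumptions, not the paper's exact construction.

```python
# Minimal sketch of pessimism via a lower confidence bound (LCB) in an
# offline multi-armed bandit; the confidence width is an assumed
# Hoeffding-style term, not the paper's construction.
import math
from collections import defaultdict

def lcb_choice(offline_data, delta=0.1):
    """offline_data: list of (arm, reward) pairs collected by some behavior policy."""
    counts, sums = defaultdict(int), defaultdict(float)
    for arm, reward in offline_data:
        counts[arm] += 1
        sums[arm] += reward
    def lcb(arm):
        n = counts[arm]
        width = math.sqrt(math.log(1.0 / delta) / (2 * n))  # assumed confidence width
        return sums[arm] / n - width                        # pessimistic value estimate
    # pick the arm with the largest pessimistic estimate;
    # rarely observed arms are penalized by wide confidence intervals
    return max(counts, key=lcb)

data = [(0, 1.0), (0, 0.0), (0, 1.0), (1, 1.0)]  # arm 1 observed only once
print(lcb_choice(data))  # chooses arm 0 despite arm 1's higher empirical mean
```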

SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory

1 code implementation · NeurIPS 2020 · Paria Rashidinejad, Jiantao Jiao, Stuart Russell

Our theoretical and experimental results shed light on the conditions required for efficient probably approximately correct (PAC) learning of the Kalman filter from partially observed data.

PAC learning
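To illustrate the object being learned, the sketch below runs one-step-ahead Kalman filter prediction for a partially observed linear dynamical system with known toy matrices; the paper's setting is the harder one where such predictions must be learned with the system unknown, and all numerical values here are assumptions.

```python
# Sketch of the prediction task the paper studies: one-step-ahead Kalman
# filter prediction for a partially observed linear dynamical system.
# The matrices (A, C, Q, R) are known toy assumptions here; the paper
# concerns learning to predict when they are *unknown*.
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])  # state transition (assumed)
C = np.array([[1.0, 0.0]])              # only the first state coordinate is observed
Q = 0.01 * np.eye(2)                    # process noise covariance (assumed)
R = np.array([[0.1]])                   # observation noise covariance (assumed)

def kalman_predict(observations):
    """Return one-step-ahead observation predictions C @ x_hat."""
    x, P = np.zeros(2), np.eye(2)
    preds = []
    for y in observations:
        # predict the next state and its uncertainty
        x, P = A @ x, A @ P @ A.T + Q
        preds.append((C @ x).item())
        # update with the new partial observation y
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ (np.atleast_1d(y) - C @ x)
        P = (np.eye(2) - K @ C) @ P
    return preds

print(kalman_predict([1.0, 0.9, 0.7, 0.6])[:2])
```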
