Search Results for author: Mohak Bhardwaj

Found 8 papers, 1 paper with code

Adversarial Model for Offline Reinforcement Learning

no code implementations NeurIPS 2023 Mohak Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng

We propose a novel model-based offline Reinforcement Learning (RL) framework, called Adversarial Model for Offline Reinforcement Learning (ARMOR), which can robustly learn policies to improve upon an arbitrary reference policy regardless of data coverage.

reinforcement-learning Reinforcement Learning (RL)

ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data

no code implementations 8 Nov 2022 Tengyang Xie, Mohak Bhardwaj, Nan Jiang, Ching-An Cheng

We propose a new model-based offline RL framework, called Adversarial Models for Offline Reinforcement Learning (ARMOR), which can robustly learn policies to improve upon an arbitrary baseline policy regardless of data coverage.

Offline RL

Leveraging Experience in Lazy Search

no code implementations 10 Oct 2021 Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa

If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly.

Imitation Learning Motion Planning

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

no code implementations ICLR 2021 Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

We further propose an algorithm that changes $\lambda$ over time to reduce the dependence on MPC as our estimates of the value function improve, and test the efficacy of our approach on challenging high-dimensional manipulation tasks with biased models in simulation.

Model Predictive Control reinforcement-learning +1

Information Theoretic Model Predictive Q-Learning

no code implementations 31 Dec 2019 Mohak Bhardwaj, Ankur Handa, Dieter Fox, Byron Boots

Model-free Reinforcement Learning (RL) works well when experience can be collected cheaply and model-based RL is effective when system dynamics can be modeled accurately.

Decision Making Model Predictive Control +3

Leveraging Experience in Lazy Search

no code implementations 16 Jul 2019 Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa

If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly.

Imitation Learning Motion Planning

Learning Heuristic Search via Imitation

1 code implementation 10 Jul 2017 Mohak Bhardwaj, Sanjiban Choudhury, Sebastian Scherer

In this paper, we do so by training a heuristic policy that maps the partial information from the search to decide which node of the search tree to expand.

Motion Planning valid
