Search Results for author: Bhrij Patel

Found 5 papers, 1 paper with code

Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic

no code implementations18 Mar 2024 Bhrij Patel, Wesley A. Suttle, Alec Koppel, Vaneet Aggarwal, Brian M. Sadler, Amrit Singh Bedi, Dinesh Manocha

In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of how long a Markov chain under a fixed policy takes to reach its stationary distribution, poses a significant challenge for the global convergence of policy gradient methods.

Policy Gradient Methods
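
To make the obstacle concrete, here is a minimal sketch (our illustration, not code from the paper) of what a mixing-time oracle would have to compute: for a finite Markov chain with transition matrix P, the mixing time t_mix(eps) is the smallest t at which every row of P^t is within total-variation distance eps of the stationary distribution.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P with eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return pi / pi.sum()

def mixing_time(P, eps=0.25, t_max=10_000):
    """Smallest t with max_s TV(P^t(s, .), pi) <= eps."""
    pi = stationary_distribution(P)
    Pt = np.eye(P.shape[0])
    for t in range(1, t_max + 1):
        Pt = Pt @ P
        tv = 0.5 * np.abs(Pt - pi).sum(axis=1).max()  # worst-case row TV
        if tv <= eps:
            return t
    return t_max  # chain did not mix within the horizon

# Example: a lazy two-state chain.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
print(mixing_time(P))
```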

Right Place, Right Time! Towards ObjectNav for Non-Stationary Goals

no code implementations14 Mar 2024 Vishnu Sashank Dorbala, Bhrij Patel, Amrit Singh Bedi, Dinesh Manocha

We address this concern by presenting results for two object-placement cases: one where the objects follow a routine or a path, and another where they are placed at random.

Object Visual Grounding
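
As a toy illustration of the two regimes (ours, not the paper's benchmark), the sketch below contrasts a goal object that follows a routine with one that is re-placed uniformly at random; the function and location names are hypothetical.

```python
import random

def routine_placement(t, route):
    """Object follows a fixed routine: its position is periodic in time."""
    return route[t % len(route)]

def random_placement(locations, rng=random):
    """Object is re-placed uniformly at random at every step."""
    return rng.choice(locations)

locations = ["kitchen", "living_room", "bedroom", "hallway"]
route = ["kitchen", "hallway", "bedroom"]  # a daily routine

for t in range(5):
    print(t, routine_placement(t, route), random_placement(locations))
```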

Ada-NAV: Adaptive Trajectory Length-Based Sample Efficient Policy Learning for Robotic Navigation

no code implementations9 Jun 2023 Bhrij Patel, Kasun Weerakoon, Wesley A. Suttle, Alec Koppel, Brian M. Sadler, Tianyi Zhou, Amrit Singh Bedi, Dinesh Manocha

Trajectory length is a crucial hyperparameter in reinforcement learning (RL) algorithms and a significant contributor to sample inefficiency in robotics applications.

Policy Gradient Methods reinforcement-learning +1
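
A hedged sketch of the idea suggested by the title: grow the rollout length as the policy becomes more decisive. The specific adaptation signal used here (normalized policy entropy) and the length bounds are our assumptions for illustration, not Ada-NAV's exact rule.

```python
import numpy as np

def policy_entropy(probs):
    """Shannon entropy of a discrete action distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def trajectory_length(probs, n_actions, min_len=8, max_len=256):
    """Short rollouts while the policy is near-uniform (exploring),
    longer rollouts as it becomes more deterministic (exploiting)."""
    h = policy_entropy(probs) / np.log(n_actions)  # normalized to [0, 1]
    return int(min_len + (1.0 - h) * (max_len - min_len))

print(trajectory_length(np.array([0.25, 0.25, 0.25, 0.25]), 4))  # -> 8
print(trajectory_length(np.array([0.97, 0.01, 0.01, 0.01]), 4))  # -> longer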

Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic

no code implementations28 Jan 2023 Wesley A. Suttle, Amrit Singh Bedi, Bhrij Patel, Brian M. Sadler, Alec Koppel, Dinesh Manocha

Many existing reinforcement learning (RL) methods employ stochastic gradient iteration on the back end, whose stability hinges upon a hypothesis that the data-generating process mixes exponentially fast with a rate parameter that appears in the step-size selection.

Reinforcement Learning (RL)
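
Below is a minimal sketch of a multi-level Monte Carlo (MLMC) gradient estimator in the spirit of this line of work, following the generic MLMC recipe (random geometric level, coupled fine/coarse averages from one trajectory, truncation at T_max); treat it as an assumption-laden illustration rather than the paper's exact estimator. The point it demonstrates is that no mixing-time parameter appears anywhere in the construction.

```python
import numpy as np

def mlmc_estimate(sample_grads, T_max=2**10, rng=None):
    """sample_grads(n) -> (n, d): n consecutive gradient samples from one
    Markovian trajectory. Returns an MLMC gradient estimate."""
    rng = rng or np.random.default_rng()
    J = int(rng.geometric(0.5))            # random level, P(J = j) = 2**-j
    n = 2**J if 2**J <= T_max else 1       # truncate very deep levels
    grads = sample_grads(n)
    est = grads[0]                         # base estimate: first sample
    if n > 1:
        g_fine = grads.mean(axis=0)              # average of 2^J samples
        g_coarse = grads[: n // 2].mean(axis=0)  # average of first 2^(J-1)
        est = est + n * (g_fine - g_coarse)      # level-J correction
    return est

# Example with a dummy i.i.d. "gradient" stream (stationary mean = 3.0):
rng = np.random.default_rng(0)
dummy = lambda n: 3.0 + rng.normal(size=(n, 1))
print(np.mean([mlmc_estimate(dummy, rng=rng) for _ in range(2000)]))
```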

In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

1 code implementation8 May 2020 Caroline Wang, Bin Han, Bhrij Patel, Cynthia Rudin

We compared predictive performance and fairness of these models against two methods that are currently used in the justice system to predict pretrial recidivism: the Arnold PSA and COMPAS.

BIG-bench Machine Learning Fairness +1
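
A small illustrative sketch (ours, not the paper's released code) of the kind of joint accuracy/fairness comparison the abstract describes, using the false-positive-rate gap between two groups as one common fairness measure; the data and the 80%-accurate predictor here are synthetic.

```python
import numpy as np

def accuracy(y_true, y_pred):
    return float((y_true == y_pred).mean())

def fpr(y_true, y_pred):
    """False positive rate: P(pred = 1 | true = 0)."""
    neg = y_true == 0
    return float(y_pred[neg].mean()) if neg.any() else 0.0

def fpr_gap(y_true, y_pred, group):
    """Absolute FPR difference between group == 0 and group == 1."""
    a, b = group == 0, group == 1
    return abs(fpr(y_true[a], y_pred[a]) - fpr(y_true[b], y_pred[b]))

# Toy data: labels, model predictions, and a binary group attribute.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
pred = np.where(rng.random(500) < 0.8, y, 1 - y)  # 80%-accurate predictor
grp = rng.integers(0, 2, 500)
print(accuracy(y, pred), fpr_gap(y, pred, grp))
```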
