Search Results for author: Harshit Sikchi

Found 8 papers, 5 papers with code

SMORE: Score Models for Offline Goal-Conditioned Reinforcement Learning

no code implementations • 3 Nov 2023 • Harshit Sikchi, Rohan Chitnis, Ahmed Touati, Alborz Geramifard, Amy Zhang, Scott Niekum

Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve multiple goals in an environment purely from offline datasets using sparse reward functions.

Contrastive Learning, reinforcement-learning, +1

Contrastive Preference Learning: Learning from Human Feedback without RL

1 code implementation • 20 Oct 2023 • Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh

Thus, learning a reward function from feedback is not only based on a flawed assumption of human preference, but also leads to unwieldy optimization challenges that stem from policy gradients or bootstrapping in the RL phase.

reinforcement-learning, Reinforcement Learning (RL)

Dual RL: Unification and New Methods for Reinforcement and Imitation Learning

1 code implementation • 16 Feb 2023 • Harshit Sikchi, Qinqing Zheng, Amy Zhang, Scott Niekum

For offline RL, our analysis frames a recent offline RL method, XQL, in the dual framework, and we further propose a new method, f-DVL, that provides alternatives to the Gumbel regression loss, fixing the known training instability of XQL.

Imitation Learning, Offline RL, +2

A Ranking Game for Imitation Learning

no code implementations • 7 Feb 2022 • Harshit Sikchi, Akanksha Saran, Wonjoon Goo, Scott Niekum

We propose a new framework for imitation learning -- treating imitation as a two-player ranking-based game between a policy and a reward.

Imitation Learning

Lyapunov Barrier Policy Optimization

1 code implementation • 16 Mar 2021 • Harshit Sikchi, Wenxuan Zhou, David Held

Current RL agents explore the environment without considering these constraints, which can lead to damage to the hardware or even other agents in the environment.

Reinforcement Learning (RL)

f-IRL: Inverse Reinforcement Learning via State Marginal Matching

1 code implementation • 9 Nov 2020 • Tianwei Ni, Harshit Sikchi, YuFei Wang, Tejus Gupta, Lisa Lee, Benjamin Eysenbach

Our method outperforms adversarial imitation learning methods in terms of sample efficiency and the required number of expert trajectories on IRL benchmarks.

Imitation Learning, reinforcement-learning, +1

Learning Off-Policy with Online Planning

1 code implementation • 23 Aug 2020 • Harshit Sikchi, Wenxuan Zhou, David Held

In this work, we investigate a novel instantiation of H-step lookahead with a learned model and a terminal value function learned by a model-free off-policy algorithm, named Learning Off-Policy with Online Planning (LOOP).

Continuous Control, reinforcement-learning, +1

Imitative Planning using Conditional Normalizing Flow

no code implementations • 31 Jul 2020 • Shubhankar Agarwal, Harshit Sikchi, Cole Gulino, Eric Wilkinson, Shivam Gautam

A popular way to plan trajectories in dynamic urban scenarios for Autonomous Vehicles is to rely on explicitly specified, hand-crafted cost functions, coupled with random sampling in the trajectory space to find the minimum-cost trajectory.

Autonomous Vehicles, Trajectory Planning
