Search Results for author: Chinmaya Kausik

Found 4 papers, 1 paper with code

A Framework for Partially Observed Reward-States in RLHF

no code implementations • 5 Feb 2024 • Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari

We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL.

reinforcement-learning

Offline Policy Evaluation and Optimization under Confounding

no code implementations • 29 Nov 2022 • Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

Offline RL, Off-policy evaluation

Learning Mixtures of Markov Chains and MDPs

1 code implementation • 17 Nov 2022 • Chinmaya Kausik, Kevin Tan, Ambuj Tewari

We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories.
