no code implementations • 11 Dec 2024 • Aishwarya Mandyam, Shengpu Tang, Jiayu Yao, Jenna Wiens, Barbara E. Engelhardt
Our empirical results show that when the reward model is misspecified and the annotations are imperfect, it is most beneficial to use the annotations only in the direct method (DM) portion of a doubly robust (DR) estimator.
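The finding suggests a specific estimator structure: fold the annotations into the reward model that drives the direct-method term, but keep them out of the importance-sampling correction. A minimal sketch for a contextual bandit, where q_hat is assumed to be a reward model already fit on logged rewards plus the annotations; the function names and setup are illustrative, not the paper's implementation:

```python
import numpy as np

def dr_estimate(contexts, actions, rewards, behavior_probs,
                target_probs, q_hat):
    """Doubly robust off-policy value estimate for a contextual bandit.

    q_hat(x, a): reward model assumed to have been fit on the logged
    rewards *plus* the imperfect annotations (the DM portion).
    target_probs: (n, n_actions) action probabilities under the target
    policy; behavior_probs: (n,) logged action probabilities.
    """
    n, n_actions = target_probs.shape

    # Direct-method term: model-based value under the target policy.
    q_all = np.array([[q_hat(contexts[i], a) for a in range(n_actions)]
                      for i in range(n)])
    dm = np.mean(np.sum(target_probs * q_all, axis=1))

    # Importance-weighted correction on observed rewards only;
    # annotations are deliberately excluded from this term.
    w = target_probs[np.arange(n), actions] / behavior_probs
    correction = np.mean(w * (rewards - q_all[np.arange(n), actions]))

    return dm + correction
```

Intuitively, the annotations are not draws from the logged-data distribution, so (on this reading of the result) they belong in the model-based term rather than the reweighted one.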
1 code implementation • 16 Nov 2023 • Aishwarya Mandyam, Matthew Jörke, William Denton, Barbara E. Engelhardt, Emma Brunskill
Tailoring advice to a person's unique goals, preferences, and life circumstances is a critical component of health coaching that has been underutilized in adaptive algorithms for mobile health interventions.
1 code implementation • 13 Mar 2023 • Aishwarya Mandyam, Didong Li, Diana Cai, Andrew Jones, Barbara E. Engelhardt
In this work, we incorporate existing domain-specific data to achieve better posterior concentration rates.
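The excerpt doesn't say how the domain data enter the model; one generic way to fold existing data into a Bayesian analysis, shown purely as an illustrative sketch, is a data-informed prior built with a kernel density estimate. The Gaussian likelihood and the sample data here are assumptions, not the paper's construction:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Illustrative stand-in for "existing domain-specific data": samples
# of the parameter gathered from prior studies or related systems.
domain_samples = np.random.default_rng(0).normal(2.0, 0.5, size=200)
informed_log_prior = gaussian_kde(domain_samples).logpdf

def log_posterior(theta, data):
    # Toy Gaussian likelihood with unit variance, illustrative only.
    log_lik = -0.5 * np.sum((data - theta) ** 2)
    # The KDE prior concentrates mass where the domain data sit,
    # which is the mechanism behind faster posterior concentration.
    return log_lik + informed_log_prior(theta)[0]
```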
no code implementations • Approximate Inference (AABI) Symposium 2022 • Aishwarya Mandyam, Didong Li, Diana Cai, Andrew Jones, Barbara Engelhardt
Inverse reinforcement learning (IRL) methods attempt to recover the reward function of an agent by observing its behavior.
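A minimal sketch of the Bayesian flavor of this setup: score each demonstrated (state, action) pair under a Boltzmann-rational choice model to obtain an unnormalized posterior over reward parameters. The planner solve_q, the prior, and the rationality parameter beta are placeholder assumptions, not the paper's method:

```python
import numpy as np

def log_posterior(reward_params, demos, solve_q, log_prior, beta=1.0):
    """Unnormalized Bayesian IRL posterior over reward parameters.

    demos: list of (state, action) pairs observed from the agent.
    solve_q: planner returning Q(s, a) for the MDP under these rewards.
    beta: Boltzmann rationality (higher = more optimal demonstrator).
    """
    Q = solve_q(reward_params)                  # |S| x |A| action values
    log_lik = 0.0
    for s, a in demos:
        # Boltzmann choice model: P(a | s) ∝ exp(beta * Q(s, a)).
        logits = beta * Q[s]
        log_lik += logits[a] - np.logaddexp.reduce(logits)
    return log_lik + log_prior(reward_params)
```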
no code implementations • 6 Oct 2021 • Aishwarya Mandyam, Andrew Jones, Jiayu Yao, Krzysztof Laudanski, Barbara Engelhardt
CFQI uses a compositional $Q$-value function with separate modules for each task variant, allowing it to take advantage of shared knowledge while learning distinct policies for each variant.
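A minimal PyTorch sketch of one way such a compositional $Q$-value function could be organized: a shared encoder carries the shared knowledge, and one head per task variant carries the distinct policies. Layer sizes and the exact module split are illustrative assumptions, not CFQI's architecture:

```python
import torch
import torch.nn as nn

class CompositionalQNet(nn.Module):
    """Q-network with a shared trunk and per-variant heads."""

    def __init__(self, state_dim, n_actions, n_variants, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(           # shared across variants
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList([            # one module per variant
            nn.Linear(hidden, n_actions) for _ in range(n_variants)
        ])

    def forward(self, state, variant_id):
        return self.heads[variant_id](self.encoder(state))

# Usage: greedy actions for variant 0 from a batch of states.
qnet = CompositionalQNet(state_dim=8, n_actions=4, n_variants=3)
states = torch.randn(32, 8)
actions = qnet(states, variant_id=0).argmax(dim=-1)
```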
no code implementations • 29 Sep 2021 • Aishwarya Mandyam, Andrew Jones, Krzysztof Laudanski, Barbara Engelhardt
Off-policy reinforcement learning (RL) has proven to be a powerful framework for guiding agents' actions in environments with stochastic rewards and unknown or noisy state dynamics.
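The method name suggests building on fitted $Q$-iteration (FQI, the base of CFQI above), a standard batch off-policy RL algorithm. A minimal single-task sketch from logged transitions, with the tree-ensemble regressor and tuple format as assumptions:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(transitions, n_actions, gamma=0.99, n_iters=50):
    """Batch off-policy FQI from logged (s, a, r, s', done) tuples."""
    S = np.array([t[0] for t in transitions])
    A = np.array([t[1] for t in transitions])
    R = np.array([t[2] for t in transitions])
    S2 = np.array([t[3] for t in transitions])
    done = np.array([t[4] for t in transitions], dtype=float)

    X = np.column_stack([S, A])
    model = None
    for _ in range(n_iters):
        if model is None:
            targets = R                         # first pass: Q ≈ reward
        else:
            # Bootstrap with max over actions at the next state.
            q_next = np.column_stack([
                model.predict(np.column_stack([S2, np.full(len(S2), a)]))
                for a in range(n_actions)
            ])
            targets = R + gamma * (1 - done) * q_next.max(axis=1)
        model = ExtraTreesRegressor(n_estimators=50).fit(X, targets)
    return model
```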