no code implementations • 19 Feb 2024 • Niclas Boehmer, Yash Nair, Sanket Shah, Lucas Janson, Aparna Taneja, Milind Tambe
When resources are scarce, an allocation policy is needed to decide who receives a resource.
no code implementations • 22 Sep 2021 • Yash Nair, Nan Jiang
We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes, where the evaluation policy depends only on observable variables but the behavior policy depends on latent states (Tennenholtz et al., 2020a).
no code implementations • 11 Jun 2020 • Yash Nair, Finale Doshi-Velez
First, we derive sample complexity bounds for direct policy learning (DPL), and then show that model-based learning from expert actions can be impossible even with a finite model class.