no code implementations • 1 Jun 2021 • Bogdan Mazoure, Paul Mineiro, Pavithra Srinath, Reza Sharifi Sedeh, Doina Precup, Adith Swaminathan
Targeting immediately measurable proxies such as clicks can lead to suboptimal recommendations due to misalignment with the long-term metric.
1 code implementation • ICML 2020 • Yi Su, Pavithra Srinath, Akshay Krishnamurthy
We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings.