no code implementations • ICLR 2022 • Chandrasekar Subramanian, Balaraman Ravindran
We study a contextual bandit setting in which the learning agent can perform interventions on targeted subsets of the population, in addition to possessing qualitative causal side-information.