Synthetic Interventions
Consider a setting with $N$ heterogeneous units (e.g., individuals, sub-populations) and $D$ interventions (e.g., socio-economic policies). Our goal is to learn the expected potential outcome associated with every intervention on every unit, totaling $N \times D$ causal parameters. To this end, we present a causal framework, synthetic interventions (SI), to infer these $N \times D$ causal parameters while observing each of the $N$ units under at most two interventions, independent of $D$. This becomes increasingly significant as the number of interventions, i.e., the level of personalization, grows. Under a novel tensor factor model across units, outcomes, and interventions, we prove an identification result for each of these $N \times D$ causal parameters and establish finite-sample consistency of our estimator, along with asymptotic normality under additional conditions. Importantly, our estimator also allows for latent confounders that determine how interventions are assigned. The estimator is further furnished with data-driven tests to examine its suitability. Empirically, we validate our framework through a large-scale A/B test performed on an e-commerce platform. We believe our results could have implications for the design of data-efficient randomized experiments (e.g., randomized controlled trials) with heterogeneous units and multiple interventions.