There is intense interest in applying machine learning to problems of causal
inference in fields such as healthcare, economics and education. In particular,
individual-level causal inference has important applications such as precision medicine.
We give a new theoretical analysis and family of algorithms for
predicting individual treatment effect (ITE) from observational data, under the
assumption known as strong ignorability. The algorithms learn a "balanced"
representation such that the induced treated and control distributions look
similar. We give a novel, simple and intuitive generalization-error bound
showing that the expected ITE estimation error of a representation is bounded
by a sum of the standard generalization error of that representation and the
distance between the treated and control distributions induced by the
representation. We use Integral Probability Metrics to measure distances
between distributions, deriving explicit bounds for the Wasserstein and Maximum
Mean Discrepancy (MMD) distances. Experiments on real and simulated data show
the new algorithms match or outperform the state of the art.
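
To make the "balanced representation" idea concrete, the following is a minimal sketch of how a distance between the treated and control distributions induced by a representation might be estimated and added to a prediction loss. It is not the authors' implementation. It assumes a Gaussian (RBF) kernel and the simple biased estimator of squared MMD, and the names `rbf_kernel`, `mmd_squared`, `balanced_objective`, `bandwidth`, and the trade-off weight `alpha` are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd_squared(phi_treated, phi_control, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two samples.

    phi_treated, phi_control: arrays of shape (n, d) and (m, d) holding the
    representations of treated and control units (assumed inputs).
    """
    k_tt = rbf_kernel(phi_treated, phi_treated, bandwidth).mean()
    k_cc = rbf_kernel(phi_control, phi_control, bandwidth).mean()
    k_tc = rbf_kernel(phi_treated, phi_control, bandwidth).mean()
    return k_tt + k_cc - 2.0 * k_tc


def balanced_objective(factual_loss, phi, t, alpha=1.0, bandwidth=1.0):
    """Prediction loss plus a weighted imbalance penalty (illustrative only).

    factual_loss: scalar loss on observed outcomes (computed elsewhere).
    phi: array of shape (n, d) of learned representations.
    t: binary array of shape (n,), 1 for treated and 0 for control.
    alpha: hypothetical weight on the imbalance penalty.
    """
    penalty = mmd_squared(phi[t == 1], phi[t == 0], bandwidth)
    return factual_loss + alpha * penalty
```

The objective mirrors the structure of the bound described above: one term for how well outcomes are predicted, and one term for how different the treated and control groups look in representation space. A Wasserstein distance could be substituted for the MMD penalty under the same structure.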