Analyzing observational data from multiple sources can be useful for increasing statistical power to detect a treatment effect; however, practical constraints such as privacy considerations may restrict individual-level information sharing across data sets.
We study the problem of model selection for contextual bandits, in which the algorithm must balance the bias-variance trade-off for model estimation while also balancing the exploration-exploitation trade-off.
In particular, when the pattern of treatment assignment in the collected data looks little like the pattern generated by the policy to be evaluated, the importance weights used in DR estimators explode, leading to excessive variance.
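A toy simulation (synthetic data and hypothetical policies, not the paper's estimator) illustrates how a mismatch between the logging policy and the evaluated policy inflates importance weights and the variance of the resulting estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Logging policy takes arm 1 only 1% of the time; the evaluated policy
# always takes arm 1, so the two assignment patterns barely overlap.
p_log, p_tgt = 0.01, 1.0

a = rng.binomial(1, p_log, size=n)      # logged actions
r = rng.normal(1.0, 1.0, size=n)        # observed rewards, true mean 1.0

# Importance weight pi_tgt(a) / pi_log(a); zero where the logged action
# disagrees with the evaluated policy.
w = np.where(a == 1, p_tgt / p_log, 0.0)
ips = w * r                             # per-sample importance-weighted terms

print(f"largest weight: {w.max():.0f}")
print(f"estimate: {ips.mean():.2f}, per-term std: {ips.std():.1f}")
```

The estimate is centered near the true mean of 1.0, but almost all of it comes from a handful of terms weighted by 1/0.01 = 100, so the per-term standard deviation dwarfs the quantity being estimated.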
We complement this regret upper bound with a lower bound that characterizes the fundamental difficulty of policy learning with adaptive data.
Computationally efficient contextual bandits are often based on estimating a predictive model of rewards given contexts and arms using past data.
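As a minimal sketch of this approach, assuming a linear reward model per arm fit by ridge regression and epsilon-greedy exploration (both illustrative choices, not the specific algorithm studied):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_arms, lam, eps = 5, 3, 1.0, 0.1

# Per-arm ridge-regression sufficient statistics: A = X'X + lam*I, b = X'y.
A = [lam * np.eye(d) for _ in range(n_arms)]
b = [np.zeros(d) for _ in range(n_arms)]

theta_true = rng.normal(size=(n_arms, d))   # hypothetical true reward model

for t in range(2_000):
    x = rng.normal(size=d)                  # observed context
    # Predicted reward for each arm under the current model estimate.
    est = [np.linalg.solve(A[k], b[k]) @ x for k in range(n_arms)]
    # Epsilon-greedy: explore with probability eps, else exploit the model.
    arm = rng.integers(n_arms) if rng.random() < eps else int(np.argmax(est))
    reward = theta_true[arm] @ x + rng.normal(scale=0.1)
    # Update the chosen arm's model with the new (context, reward) pair.
    A[arm] += np.outer(x, x)
    b[arm] += reward * x

theta_hat = np.array([np.linalg.solve(A[k], b[k]) for k in range(n_arms)])
print("max estimation error:", float(np.abs(theta_hat - theta_true).max()))
```

Note the feedback loop the abstract alludes to: each arm's model is fit only on the contexts where that arm was chosen, so the estimation problem depends on the path of exploration.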
When realizability does not hold, our algorithm ensures the same guarantees on regret achieved by realizability-based algorithms under realizability, up to an additive term that accounts for the misspecification error.
no code implementations • 21 Apr 2020 • Allison Koenecke, Michael Powell, Ruoxuan Xiong, Zhu Shen, Nicole Fischer, Sakibul Huq, Adham M. Khalafallah, Marco Trevisan, Pär Sparen, Juan J Carrero, Akihiko Nishimura, Brian Caffo, Elizabeth A. Stuart, Renyuan Bai, Verena Staedtke, David L. Thomas, Nickolas Papadopoulos, Kenneth W. Kinzler, Bert Vogelstein, Shibin Zhou, Chetan Bettegowda, Maximilian F. Konig, Brett Mensh, Joshua T. Vogelstein, Susan Athey
Here, we conducted retrospective analyses in two cohorts of patients with acute respiratory distress (ARD, n=18,547) and three cohorts with pneumonia (n=400,907).

These weights are then used in a weighted regression to improve the accuracy of the estimated effect of each variable, thus helping to improve the stability of prediction across unknown test data.
In the general setting where outcomes depend on latent covariates, we show that historical data can be utilized in designing experiments.
In this context, typical estimators that use inverse propensity weighting to eliminate sampling bias can be problematic: their distributions become skewed and heavy-tailed as the propensity scores decay to zero.
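A small synthetic example (hypothetical propensity model, not taken from the paper) shows how propensity scores near zero make the inverse-propensity-weighted estimator heavy-tailed:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5_000

x = rng.uniform(size=n)
e = 0.02 + 0.96 * x**4                    # propensity scores, near zero for small x
treated = rng.binomial(1, e)              # treatment assignment
y = 1.0 + rng.normal(scale=0.5, size=n)   # outcome; true treated mean is 1.0

# IPW estimate of the mean treated potential outcome: a few units with
# tiny propensities receive enormous weights and dominate the sum.
terms = treated * y / e
print(f"estimate: {terms.mean():.2f}")
print(f"largest single term: {terms.max():.1f}")
```

The estimator remains roughly centered at the truth, but its distribution is skewed: a single observation can contribute a term dozens of times larger than the target quantity.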
We discuss the use of Wasserstein Generative Adversarial Networks (WGANs) as a method for systematically generating artificial data that closely mimic any given real data set, without the researcher having many degrees of freedom.
One source of the improvement is the ability of the model to accurately estimate heterogeneity in preferences (by pooling information across categories); another source of improvement is its ability to estimate the preferences of consumers who have rarely or never made a purchase in a given category in the training data.
This paper studies a panel data setting where the goal is to estimate causal effects of an intervention by predicting the counterfactual values of outcomes for treated units, had they not received the treatment.
We apply causal forests to a dataset derived from the National Study of Learning Mindsets, and consider resulting practical and conceptual challenges.
We present a new estimator for causal effects with panel data that builds on insights behind the widely used difference in differences and synthetic control methods.
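For intuition, here is a bare-bones synthetic-control-style sketch on simulated panel data, using least-squares weights fit on pre-treatment outcomes (an illustrative simplification; the estimator in the paper differs):

```python
import numpy as np

rng = np.random.default_rng(2)
T0, T1, J = 30, 10, 8                    # pre-periods, post-periods, control units

# Hypothetical panel: all units load on a common factor; the treated unit
# also receives a true post-treatment effect of 2.0.
factor = rng.normal(size=T0 + T1)
loadings = rng.uniform(0.5, 1.5, size=(J, 1))
Y_ctrl = loadings * factor[None, :] + rng.normal(scale=0.1, size=(J, T0 + T1))
Y_trt = factor + rng.normal(scale=0.1, size=T0 + T1)
Y_trt[T0:] += 2.0                        # treatment effect in post periods

# Weights on control units chosen to reproduce the treated unit's
# pre-treatment trajectory (here by ordinary least squares).
w, *_ = np.linalg.lstsq(Y_ctrl[:, :T0].T, Y_trt[:T0], rcond=None)

# Counterfactual = weighted controls; effect = observed minus counterfactual.
effect = (Y_trt[T0:] - w @ Y_ctrl[:, T0:]).mean()
print(f"estimated effect: {effect:.2f} (true effect 2.0)")
```

The connection to difference in differences: instead of subtracting a simple average of controls, the counterfactual is a weighted combination chosen to match the treated unit's pre-treatment path.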
Contextual bandit algorithms are sensitive both to how the outcome model is estimated and to the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning.
We consider a game-theoretical multi-agent learning problem where the feedback information can be lost during the learning process and rewards are given by a broad class of games known as variationally stable games.
In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action.
Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals, and can show poor predictive performance in the presence of strong, smooth effects.
In this paper, we propose a novel Deep Global Balancing Regression (DGBR) algorithm to jointly optimize a deep auto-encoder model for feature selection and a global balancing model for stable prediction across unknown environments.
The data is used to identify users' approximate typical morning location, as well as their choices of lunchtime restaurants.
Embedding models consider the probability of a target observation (a word or an item) conditioned on the elements in the context (other words or items).
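A minimal sketch of such a conditional probability, assuming dot-product scores between a summed context embedding and each candidate target, passed through a softmax (the embedding values here are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(3)
V, d = 6, 4                               # vocabulary size, embedding dimension

# Hypothetical embeddings: one vector per context element (rho) and one
# per candidate target (alpha), as in word/item embedding models.
rho = rng.normal(scale=0.1, size=(V, d))
alpha = rng.normal(scale=0.1, size=(V, d))

def p_target_given_context(context_ids):
    """Probability of each candidate target conditioned on the context:
    softmax over dot products with the summed context embedding."""
    c = rho[context_ids].sum(axis=0)      # combine the context elements
    scores = alpha @ c                    # one score per candidate target
    e = np.exp(scores - scores.max())     # stabilized softmax
    return e / e.sum()

probs = p_target_given_context([0, 2, 5])
print(probs)
```

The output is a proper distribution over the vocabulary: entries are positive and sum to one.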
We develop parametric and non-parametric contextual bandits that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation bias.
In this paper we study methods for estimating causal effects in settings with panel data, where some units are exposed to a treatment during some periods and the goal is estimating counterfactual (untreated) outcomes for the treated unit/period combinations.
We derive standard errors that account for design-based uncertainty instead of, or in addition to, sampling-based uncertainty.
In many areas, practitioners seek to use observational data to learn a treatment assignment policy that satisfies application-specific constraints, such as budget, fairness, simplicity, or other functional form constraints.
We propose generalized random forests, a method for non-parametric statistical estimation based on random forests (Breiman, 2001) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations.
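The local moment idea can be sketched with kernel weights standing in for the forest-derived similarity weights (an illustrative simplification; the forest construction itself is the paper's contribution). Here the moment condition sum_i alpha_i(x0) * (Y_i - theta) = 0 identifies a conditional mean:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000
X = rng.uniform(-1, 1, size=n)
Y = X**2 + rng.normal(scale=0.1, size=n)   # hypothetical smooth signal

def local_mean(x0, bandwidth=0.1):
    """Solve sum_i alpha_i(x0) * (Y_i - theta) = 0 for theta, where
    alpha_i are similarity weights (Gaussian kernel weights stand in
    here for the forest-derived weights of generalized random forests)."""
    alpha = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    alpha /= alpha.sum()
    return float(alpha @ Y)   # closed-form root of the mean moment equation

print(local_mean(0.5))        # should be near 0.25, the true E[Y | X = 0.5]
```

Swapping in other moment functions (quantiles, treatment-effect moments) recovers other local quantities of interest from the same weighting scheme.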
There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment assignment be as good as random conditional on pre-treatment variables.
We focus primarily on a setting with two samples, an experimental sample containing data about the treatment indicator and the surrogates and an observational sample containing information about the surrogates and the primary outcome.
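A stylized two-sample sketch (synthetic data; a linear surrogate index fit by least squares, which is an illustrative choice): fit E[Y | surrogates] in the observational sample, impute long-term outcomes in the experimental sample, and compare arms.

```python
import numpy as np

rng = np.random.default_rng(5)

# Observational sample: surrogates S and long-term outcome Y (no treatment).
n_obs = 5_000
S_obs = rng.normal(size=n_obs)
Y_obs = 2.0 * S_obs + rng.normal(scale=0.5, size=n_obs)

# Fit the surrogate index E[Y | S] by least squares.
beta = np.polyfit(S_obs, Y_obs, 1)         # [slope, intercept]

# Experimental sample: treatment W and surrogates S, but no long-term Y.
# Treatment shifts the surrogate by 0.5, so the implied effect is 2.0 * 0.5.
n_exp = 5_000
W = rng.binomial(1, 0.5, size=n_exp)
S_exp = 0.5 * W + rng.normal(size=n_exp)

Y_hat = np.polyval(beta, S_exp)            # impute long-term outcomes
effect = Y_hat[W == 1].mean() - Y_hat[W == 0].mean()
print(f"estimated long-term effect: {effect:.2f} (implied truth 1.0)")
```

The key requirement, as in the surrogacy literature, is that the treatment affects the long-term outcome only through the surrogates.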
Many scientific and engineering challenges -- ranging from personalized medicine to customized marketing recommendations -- require an understanding of treatment effect heterogeneity.
The challenge is that the "ground truth" for a causal effect is not observed for any individual unit: we observe the unit with the treatment, or without the treatment, but not both at the same time.