Dataset Overview
This dataset contains individual-level data from a randomized controlled trial (RCT) conducted in northern Uganda, along with associated satellite imagery. It is designed for studying how treatment effects vary across geographic and contextual settings, using both tabular and image-based variables.
Motivation and Content
- Researchers often wish to explore treatment effect heterogeneity, especially in studies focused on global poverty. Traditional variables, such as age and ethnicity, are typically measured around the time of the study and may overlook broader environmental, historical, or neighborhood-specific factors.
- Incorporating satellite images into causal inference analyses provides a valuable window into such contextual factors. This dataset exemplifies how researchers can combine tabular data (e.g., demographic variables, outcomes, treatment indicators) with geospatially keyed satellite imagery to model and interpret how treatment effects change across different locations.
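As a rough illustration of the pairing described above, the sketch below links tabular records to image references through a shared geospatial key. Every field name here (`unit_id`, `geo_key`, `treated`, `outcome`) and every file name is a hypothetical placeholder, not this dataset's actual schema; consult the data files for the real column names.

```python
from dataclasses import dataclass

# All field names below (unit_id, geo_key, treated, outcome) are hypothetical
# placeholders; check the dataset's actual column names before adapting this.


@dataclass
class Unit:
    unit_id: int
    geo_key: str     # key linking the unit to a satellite image tile
    treated: int     # 1 if assigned to treatment, 0 otherwise
    outcome: float


def attach_images(units, image_index):
    """Pair each unit with its image reference; collect IDs with no match."""
    paired, missing = [], []
    for u in units:
        ref = image_index.get(u.geo_key)
        if ref is None:
            missing.append(u.unit_id)
        else:
            paired.append((u, ref))
    return paired, missing


# Toy records standing in for the tabular data.
units = [
    Unit(1, "tile_a", 1, 2.5),
    Unit(2, "tile_a", 0, 1.0),
    Unit(3, "tile_b", 1, 0.5),
    Unit(4, "tile_z", 0, 0.0),  # no image available for this key
]

# Toy index from geospatial key to an image file name (made-up names).
image_index = {"tile_a": "tile_a.png", "tile_b": "tile_b.png"}

paired, missing = attach_images(units, image_index)
```

Several individuals can share one image key, so the join is many-to-one. Tracking unmatched IDs explicitly makes it easy to see how many units would be dropped from any image-based analysis.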
Potential Use Cases
- Causal Inference Research: Apply image-based methods to detect and explain geographic or contextual heterogeneity in RCT outcomes.
- Policy Evaluation: Aid policymakers in identifying areas or populations most likely to benefit from poverty-alleviation interventions.
- Methodological Innovations: Serve as a testbed for new models that integrate high-dimensional or unstructured data (images) with standard tabular data in the causal inference setting.
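To make the idea of contextual heterogeneity concrete, here is a minimal sketch that computes a difference in mean outcomes between treated and control units within each context group. It is a simple baseline for illustration only, not the method of the source paper. The group labels (for example, clusters derived from images by some upstream step) and all values are made-up assumptions.

```python
from collections import defaultdict
from statistics import mean


def diff_in_means_by_group(rows):
    """rows: iterable of (group, treated, outcome). Returns {group: effect}."""
    buckets = defaultdict(lambda: {0: [], 1: []})
    for group, treated, outcome in rows:
        buckets[group][treated].append(outcome)

    effects = {}
    for group, arms in buckets.items():
        # Skip groups lacking either arm; no contrast can be formed there.
        if arms[0] and arms[1]:
            effects[group] = mean(arms[1]) - mean(arms[0])
    return effects


# Hypothetical rows: (context group, treatment indicator, outcome).
rows = [
    ("cluster_1", 1, 3.0), ("cluster_1", 1, 5.0),
    ("cluster_1", 0, 1.0), ("cluster_1", 0, 2.0),
    ("cluster_2", 1, 2.0), ("cluster_2", 0, 2.0), ("cluster_2", 0, 1.0),
    ("cluster_3", 1, 4.0),  # treated only, so it is skipped
]

effects = diff_in_means_by_group(rows)
```

Because treatment is randomized, a within-group difference in means is a reasonable first look at how effects differ by context, provided the grouping is fixed before outcomes are examined and each group has enough units in both arms.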
Source
Connor T. Jerzak, Fredrik Johansson, Adel Daoud. Image-based Treatment Effect Heterogeneity. Proceedings of the Second Conference on Causal Learning and Reasoning (CLeaR), Proceedings of Machine Learning Research (PMLR), 213: 531-552, 2023.