Panel Data with Unknown Clusters
Clustered standard errors and approximate randomization tests are popular inference methods that allow for dependence within clusters of observations. However, they require researchers to know the cluster structure ex ante. We propose a procedure to help researchers discover clusters in panel data. Our method is based on thresholding an estimated long-run variance-covariance matrix. It requires the panel to be large in the time dimension but imposes no lower bound on the number of units. We show that our procedure recovers the true clusters with high probability without assumptions on the cluster structure. The estimated clusters are of independent interest, but they can also be used in approximate randomization tests or with conventional cluster-robust covariance estimators. The resulting procedures control size and have good power.
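The thresholding idea can be illustrated with a minimal sketch. It is not the paper's exact procedure: the Bartlett-kernel (Newey-West) estimator, the fixed correlation cutoff, the connected-components grouping, and the names `long_run_cov` and `threshold_clusters` are all assumptions made here for illustration.

```python
import numpy as np


def long_run_cov(x, bandwidth):
    """Bartlett-kernel (Newey-West) long-run covariance of a (T, N) array.

    Illustrative choice of estimator; the source does not specify one here.
    """
    x = x - x.mean(axis=0)
    t = x.shape[0]
    omega = x.T @ x / t  # lag-0 covariance
    for lag in range(1, bandwidth + 1):
        weight = 1.0 - lag / (bandwidth + 1.0)
        gamma = x[lag:].T @ x[:-lag] / t  # lag-`lag` autocovariance
        omega += weight * (gamma + gamma.T)
    return omega


def threshold_clusters(x, bandwidth, cutoff):
    """Group units whose long-run correlation exceeds `cutoff` in absolute value.

    Units are linked when the thresholded entry is nonzero, and clusters are
    the connected components of that graph. The cutoff is a hypothetical
    tuning parameter, not a rule taken from the source.
    """
    omega = long_run_cov(x, bandwidth)
    scale = np.sqrt(np.diag(omega))
    corr = omega / np.outer(scale, scale)
    adjacent = np.abs(corr) > cutoff

    n = adjacent.shape[0]
    labels = -np.ones(n, dtype=int)  # -1 marks an unvisited unit
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over linked units
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels
```

Here `x` would hold one time series per unit, for example regression scores, with rows indexed by time. The returned labels could then be passed to a cluster-robust covariance estimator. Both `bandwidth` and `cutoff` are left to the user in this sketch.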