Testing Independence under Biased Sampling

12 Dec 2019 · Yaniv Tenzer, Micha Mandel, Or Zuk

Testing for association or dependence between pairs of random variables is a fundamental problem in statistics. In some applications, data are subject to selection bias that induces dependence in the observed data even when it is absent from the population. An important example is truncation models, in which observed pairs are restricted to a specific subset of the X-Y plane. Standard tests for independence are not suitable in such cases, and alternative tests that account for the selection bias are required. To address this, we generalize the notion of quasi-independence with respect to the sampling mechanism and study the problem of detecting deviations from it. We develop a test motivated by Hoeffding's classic statistic and use two approaches to compute its distribution under the null: (i) a bootstrap-based approach and (ii) an exact permutation test with non-uniform probabilities over permutations. We prove the validity of the tests and show, using simulations, that they perform very well in important special cases of the problem and achieve improved power compared with competing methods. The tests are applied to four datasets: two that are subject to truncation, one that is subject to length bias, and one with a special bias mechanism.
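
To make the truncation example concrete, the following minimal sketch simulates independent X and Y, keeps only the pairs that fall in the region X <= Y, and compares the sample correlation before and after selection. It is an illustration only and not the test proposed in the paper: the uniform distributions, the truncation region X <= Y, and the helper names `pearson` and `simulate_truncation` are all assumptions chosen for this example.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_truncation(n, seed=0):
    """Draw n independent Uniform(0, 1) pairs; observe only those with x <= y.

    The truncation region {x <= y} is a hypothetical choice for illustration.
    """
    rng = random.Random(seed)
    population = [(rng.random(), rng.random()) for _ in range(n)]
    observed = [(x, y) for x, y in population if x <= y]
    return population, observed


population, observed = simulate_truncation(20000)
r_population = pearson(*zip(*population))  # close to 0: X and Y are independent
r_observed = pearson(*zip(*observed))      # clearly positive: induced by selection
print(f"population correlation: {r_population:.3f}")
print(f"observed correlation:   {r_observed:.3f}")
```

In this particular setup the observed pairs are uniform on a triangle, for which the theoretical correlation is 0.5, so a standard independence test applied to the observed sample would reject even though X and Y are independent in the population.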

