Asymptotic Normality for Multivariate Random Forest Estimators

7 Dec 2020 · Kevin Li

Regression trees and random forests are popular and effective non-parametric estimators in practical applications. A recent paper by Athey and Wager shows that the random forest estimate at any point is asymptotically Gaussian; in this paper, we extend this result to the multivariate case and show that the vector of estimates at multiple points is jointly normal. Specifically, the covariance matrix of the limiting normal distribution is diagonal, so that the estimates at any two points are independent in sufficiently deep trees. Moreover, each off-diagonal term is bounded by quantities capturing how likely it is that two points belong to the same partition of the resulting tree. Our results rely on a certain stability property when constructing splits, and we give examples of splitting rules for which this assumption is and is not satisfied. We test our proposed covariance bound and the associated coverage rates of confidence intervals in numerical simulations.
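The quantity behind the off-diagonal bound, the chance that two query points share a leaf, can be illustrated with a small Monte Carlo sketch. This is not the paper's construction or simulation setup: the function name `same_leaf_probability`, the unit-cube domain, and the data-independent splitting rule (uniform random coordinate, uniform random split point) are all assumptions chosen to keep the example self-contained.

```python
import random

def same_leaf_probability(x1, x2, depth, n_trees=2000, seed=0):
    """Fraction of random trees in which x1 and x2 end up in the same leaf.

    Hypothetical splitting rule (an assumption, not the rule analysed in the
    paper): at each level, pick a coordinate uniformly at random and a split
    point uniformly inside the current cell of the unit cube.
    """
    rng = random.Random(seed)
    d = len(x1)
    together = 0
    for _ in range(n_trees):
        lo = [0.0] * d  # lower corner of the cell containing x1
        hi = [1.0] * d  # upper corner of the cell containing x1
        same = True
        for _ in range(depth):
            j = rng.randrange(d)
            s = rng.uniform(lo[j], hi[j])
            # The points are separated as soon as they fall on opposite sides.
            if (x1[j] <= s) != (x2[j] <= s):
                same = False
                break
            # Otherwise shrink the cell to the side both points are on.
            if x1[j] <= s:
                hi[j] = s
            else:
                lo[j] = s
        together += same
    return together / n_trees

if __name__ == "__main__":
    a, b = (0.3, 0.3), (0.7, 0.7)
    for depth in (0, 2, 6, 12):
        print(depth, same_leaf_probability(a, b, depth))
```

Under this toy rule the estimated probability shrinks as the trees get deeper, which matches the abstract's intuition that estimates at two distinct points decouple in sufficiently deep trees. Whether a given data-dependent splitting rule behaves the same way depends on the stability property the abstract mentions.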
