Recursive partitioning and multi-scale modeling on conditional densities

14 Nov 2016 · Li Ma

We introduce a nonparametric prior on the conditional distribution of a (univariate or multivariate) response given a set of predictors. The prior is constructed as a two-stage generative procedure: the first stage recursively partitions the predictor space, and the second stage generates the conditional distribution through a multi-scale nonparametric density model on each block of the resulting predictor partition. This design allows adaptive smoothing on both the predictor space and the response space and makes the model fully posterior conjugate. As a result, exact Bayesian inference can be carried out analytically through a forward-backward recursive algorithm without the need for MCMC, with computational cost that scales linearly with the sample size. We show that this prior enjoys desirable theoretical properties such as full $L_1$ support and posterior consistency. We illustrate how to apply the model to a variety of inference problems, such as conditional density estimation, hypothesis testing, and model selection, in a manner similar to applying a parametric conjugate prior while remaining fully nonparametric. We also compare the model with two other state-of-the-art Bayesian nonparametric models for conditional densities in terms of both model fit and computational time. A real data example from flow cytometry containing 455,472 observations illustrates the substantial computational efficiency of our method and its application to multivariate problems.
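The two-stage generative procedure described above can be pictured with a small simulation. The following is only a minimal sketch under simplifying assumptions, not the prior defined in the paper: it assumes a single predictor and a single response, both on [0, 1), dyadic (midpoint) splits, a fixed stopping probability, and Beta-distributed mass splits in the style of a Pólya tree. All function names, parameters, and default values are hypothetical and chosen for illustration.

```python
import random


def sample_partition(lo, hi, depth, max_depth, stop_prob, rng):
    """Stage 1: recursively halve the predictor interval [lo, hi).

    At each node, stop with probability stop_prob (or at max_depth);
    otherwise split at the midpoint and recurse on both halves.
    Returns the leaf blocks, ordered from left to right.
    """
    if depth == max_depth or rng.random() < stop_prob:
        return [(lo, hi)]
    mid = (lo + hi) / 2.0
    left = sample_partition(lo, mid, depth + 1, max_depth, stop_prob, rng)
    right = sample_partition(mid, hi, depth + 1, max_depth, stop_prob, rng)
    return left + right


def sample_multiscale_density(levels, rng):
    """Stage 2: draw a random histogram density on the response space [0, 1).

    Mass is split in two at every scale using a Beta draw, giving
    2**levels equal-width bins. Returns the density height of each bin.
    """
    masses = [1.0]
    for level in range(1, levels + 1):
        a = float(level ** 2)  # assumption: Beta(a, a) with a growing in depth
        finer = []
        for mass in masses:
            frac_left = rng.betavariate(a, a)
            finer.extend([mass * frac_left, mass * (1.0 - frac_left)])
        masses = finer
    n_bins = len(masses)
    return [mass * n_bins for mass in masses]  # height = mass / bin width


def sample_conditional_density(max_depth=3, stop_prob=0.5, levels=4, seed=0):
    """Draw one conditional density: one response density per predictor block."""
    rng = random.Random(seed)
    blocks = sample_partition(0.0, 1.0, 0, max_depth, stop_prob, rng)
    return [(block, sample_multiscale_density(levels, rng)) for block in blocks]


def conditional_density(model, x, y):
    """Evaluate the sampled density of y given x, for x and y in [0, 1)."""
    for (lo, hi), heights in model:
        if lo <= x < hi:
            return heights[min(int(y * len(heights)), len(heights) - 1)]
    raise ValueError("x must lie in [0, 1)")
```

Calling `sample_conditional_density(seed=1)` returns a list of `(block, heights)` pairs, and `conditional_density(model, 0.3, 0.7)` looks up the block containing the predictor value and then the response bin. Because every predictor block carries its own response density, the sketch shows how smoothing can adapt separately along the predictor and the response; it does not implement the posterior recursion mentioned in the abstract.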


Categories


Methodology · Statistics Theory · Computation
