Generalised Boosted Forests

24 Feb 2021  ·  Indrayudh Ghosal, Giles Hooker ·

This paper extends recent work on boosting random forests to model non-Gaussian responses. Given an exponential family $\mathbb{E}[Y|X] = g^{-1}(f(X))$ our goal is to obtain an estimate for $f$. We start with an MLE-type estimate in the link space and then define generalised residuals from it. We use these residuals and some corresponding weights to fit a base random forest and then repeat the same to obtain a boost random forest. We call the sum of these three estimators a \textit{generalised boosted forest}. We show with simulated and real data that both the random forest steps reduces test-set log-likelihood, which we treat as our primary metric. We also provide a variance estimator, which we can obtain with the same computational cost as the original estimate itself. Empirical experiments on real-world data and simulations demonstrate that the methods can effectively reduce bias, and that confidence interval coverage is conservative in the bulk of the covariate distribution.

PDF Abstract
No code implementations yet. Submit your code now



  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here