Bayesian hierarchical models can infer interpretable predictions of leaf area index from heterogeneous datasets

Environmental scientists often have to predict a complex phenomenon from a heterogeneous collection of datasets. This is particularly challenging if there are systematic differences between the datasets, as is often the case. Accounting for these differences requires a larger number of parameters and thus increases the risk of overfitting. We investigate how Bayesian hierarchical models can help mitigate this problem by allowing the practitioner to explicitly incorporate information about the dataset structure and general domain knowledge. To this end, we look at a typical application in remote sensing: the estimation of leaf area index (of white winter wheat), an important indicator for agronomical modeling, from measurements of reflectance spectra collected at different locations and growth stages. Since the insights gained from such a model could be used to inform policy or business decisions, the interpretability of the model is a primary concern. We therefore focus on models that capture the association between leaf area index and the spectral reflectance at various wavelengths by spline-based kernel functions, which can be visually inspected and analyzed. We compare models with three different levels of hierarchy: a non-hierarchical baseline model, a model with a hierarchical bias parameter, and a model in which both bias and kernel parameters are hierarchically structured. We analyze them using Markov chain Monte Carlo sampling diagnostics and an intervention-based measure of feature importance. The improved robustness and interpretability of this approach lead us to recommend Bayesian hierarchical models as a versatile tool for environmental sciences and beyond, particularly in scenarios where the available data sources are heterogeneous.
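
To illustrate what a hierarchical bias parameter does, the following is a minimal sketch and not the model used in the paper. It assumes a simple normal-normal setup with known standard deviations: each dataset j has a bias b_j ~ N(0, tau^2), and residuals r_ij = b_j + eps_ij with eps_ij ~ N(0, sigma^2). The function name `partial_pooled_bias` and the values of `sigma` and `tau` are hypothetical, chosen only for illustration.

```python
import numpy as np


def partial_pooled_bias(residuals_by_group, sigma, tau):
    """Posterior mean of each group's bias under a normal-normal model.

    Assumed model (illustrative only):
        r_ij = b_j + eps_ij,  eps_ij ~ N(0, sigma^2),  b_j ~ N(0, tau^2)
    with sigma and tau treated as known.
    """
    estimates = {}
    for name, residuals in residuals_by_group.items():
        r = np.asarray(residuals, dtype=float)
        n = r.size
        # Weight of the data relative to the prior: approaches 1 for
        # large groups and shrinks toward 0 for small ones.
        shrink = (n / sigma**2) / (n / sigma**2 + 1.0 / tau**2)
        estimates[name] = float(shrink * r.mean())
    return estimates


# Two hypothetical datasets with the same mean residual but different sizes.
example = {"site_a": [1.0, 1.0, 1.0, 1.0], "site_b": [1.0]}
estimates = partial_pooled_bias(example, sigma=1.0, tau=1.0)
```

With these inputs the estimate for `site_a` is 0.8 and for `site_b` is 0.5: the smaller dataset is pulled more strongly toward the shared prior mean of zero. This shrinkage is one way a hierarchical structure can add per-dataset parameters while limiting the risk of overfitting.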
