Principled, practical, flexible, fast: a new approach to phylogenetic factor analysis

Biological phenotypes are products of complex evolutionary processes in which selective forces influence multiple biological trait measurements in unknown ways. Phylogenetic factor analysis disentangles these relationships across the evolutionary history of a group of organisms. Scientists seeking to employ this modeling framework confront numerous modeling and implementation decisions, the details of which pose computational and replicability challenges. General and impactful community employment requires a data scientific analysis plan that balances flexibility, speed and ease of use, while minimizing model and algorithm tuning. Even in the presence of non-trivial phylogenetic model constraints, we show that one may analytically address latent factor uncertainty in a way that (a) aids model flexibility, (b) accelerates computation (by as much as 500-fold) and (c) decreases required tuning. We further present practical guidance on inference and modeling decisions as well as diagnosing and solving common problems in these analyses. We codify this analysis plan in an automated pipeline that distills the potentially overwhelming array of modeling decisions into a small handful of (typically binary) choices. We demonstrate the utility of these methods and analysis plan in four real-world problems of varying scales.
