Controlling for sparsity in sparse factor analysis models: adaptive latent feature sharing for piecewise linear dimensionality reduction

22 Jun 2020  ·  Adam Farooq, Yordan P. Raykov, Petar Raykov, Max A. Little ·

Ubiquitous linear Gaussian exploratory tools such as principle component analysis (PCA) and factor analysis (FA) remain widely used as tools for: exploratory analysis, pre-processing, data visualization and related tasks. However, due to their rigid assumptions including crowding of high dimensional data, they have been replaced in many settings by more flexible and still interpretable latent feature models. The Feature allocation is usually modelled using discrete latent variables assumed to follow either parametric Beta-Bernoulli distribution or Bayesian nonparametric prior. In this work we propose a simple and tractable parametric feature allocation model which can address key limitations of current latent feature decomposition techniques. The new framework allows for explicit control over the number of features used to express each point and enables a more flexible set of allocation distributions including feature allocations with different sparsity levels. This approach is used to derive a novel adaptive Factor analysis (aFA), as well as, an adaptive probabilistic principle component analysis (aPPCA) capable of flexible structure discovery and dimensionality reduction in a wide case of scenarios. We derive both standard Gibbs sampler, as well as, an expectation-maximization inference algorithms that converge orders of magnitude faster to a reasonable point estimate solution. The utility of the proposed aPPCA model is demonstrated for standard PCA tasks such as feature learning, data visualization and data whitening. We show that aPPCA and aFA can infer interpretable high level features both when applied on raw MNIST and when applied for interpreting autoencoder features. We also demonstrate an application of the aPPCA to more robust blind source separation for functional magnetic resonance imaging (fMRI).

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods