Efficient L1-Norm Principal-Component Analysis via Bit Flipping

6 Oct 2016  ·  Panos P. Markopoulos, Sandipan Kundu, Shubham Chamadia, Dimitris A. Pados ·

It was shown recently that the $K$ L1-norm principal components (L1-PCs) of a real-valued data matrix $\mathbf X \in \mathbb R^{D \times N}$ ($N$ data samples of $D$ dimensions) can be exactly calculated with cost $\mathcal{O}(2^{NK})$ or, when advantageous, $\mathcal{O}(N^{dK - K + 1})$ where $d=\mathrm{rank}(\mathbf X)$, $K<d$ [1],[2]. In applications where $\mathbf X$ is large (e.g., "big" data of large $N$ and/or "heavy" data of large $d$), these costs are prohibitive. In this work, we present a novel suboptimal algorithm for the calculation of the $K < d$ L1-PCs of $\mathbf X$ of cost $\mathcal O(ND \mathrm{min} \{ N,D\} + N^2(K^4 + dK^2) + dNK^3)$, which is comparable to that of standard (L2-norm) PC analysis. Our theoretical and experimental studies show that the proposed algorithm calculates the exact optimal L1-PCs with high frequency and achieves higher value in the L1-PC optimization metric than any known alternative algorithm of comparable computational cost. The superiority of the calculated L1-PCs over standard L2-PCs (singular vectors) in characterizing potentially faulty data/measurements is demonstrated with experiments on data dimensionality reduction and disease diagnosis from genomic data.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here