Limitations of Implicit Bias in Matrix Sensing: Initialization Rank Matters
In matrix sensing, we first numerically identify sensitivity to the initialization rank as a new limitation of the implicit bias of gradient flow. We then partially quantify this phenomenon mathematically: we establish that the gradient flow of the empirical risk is implicitly biased towards low-rank outcomes and successfully learns the planted low-rank matrix, provided that the initialization is low-rank and lies within a specific "capture neighborhood". This capture neighborhood is far larger than the corresponding neighborhood in local refinement results: the former contains all models with zero training error, whereas the latter is a small neighborhood of a model with zero test error. These insights enable us to design an alternative algorithm for matrix sensing that complements the high-rank, near-zero initialization scheme that is predominant in the existing literature.
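To make the role of the initialization rank concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It rests on several assumptions that the abstract does not state: a symmetric factorization X = U U^T with a square factor U, symmetric Gaussian measurement matrices, a rank-1 planted matrix, and plain gradient descent with a small step as a stand-in for gradient flow. The names `risk_and_grad` and `run_demo` and all parameter values are hypothetical.

```python
import numpy as np


def risk_and_grad(U, A, y):
    """Empirical risk 0.5 * mean((<A_i, U U^T> - y_i)^2) and its gradient in U.

    Assumption: the model is parametrized as X = U U^T (not taken from the source).
    """
    X = U @ U.T
    residual = np.einsum("kij,ij->k", A, X) - y
    risk = 0.5 * np.mean(residual ** 2)
    # Gradient with respect to X is G; the chain rule through X = U U^T gives (G + G^T) U.
    G = np.einsum("k,kij->ij", residual, A) / len(y)
    return risk, (G + G.T) @ U


def run_demo(d=5, m=60, step=0.01, iters=1000, seed=0):
    """Gradient descent from a rank-1 initialization (hypothetical setup)."""
    rng = np.random.default_rng(seed)

    # Planted rank-1 matrix X* = x x^T with a unit-norm x.
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    X_star = np.outer(x, x)

    # Symmetric Gaussian measurement matrices and noiseless measurements.
    B = rng.standard_normal((m, d, d))
    A = (B + B.transpose(0, 2, 1)) / 2
    y = np.einsum("kij,ij->k", A, X_star)

    # Low-rank (rank-1) initialization of the square factor U.
    U = 0.1 * np.outer(rng.standard_normal(d), rng.standard_normal(d))

    risk_start, _ = risk_and_grad(U, A, y)
    for _ in range(iters):
        _, grad = risk_and_grad(U, A, y)
        U = U - step * grad
    risk_end, _ = risk_and_grad(U, A, y)

    # Numerical rank of the final factor, with an explicit tolerance.
    rank_end = np.linalg.matrix_rank(U, tol=1e-8)
    return risk_start, risk_end, rank_end


risk_start, risk_end, rank_end = run_demo()
print(f"risk: {risk_start:.3e} -> {risk_end:.3e}, numerical rank of U: {rank_end}")
```

In this discretization each update has the form U ← (I − step · (G + G^T)) U, a left multiplication, so the rank of U cannot increase along the iterations. That is one simple way to see why the rank of the initialization can constrain what such a scheme is able to reach.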