Impact of signal-to-noise ratio and bandwidth on graph Laplacian spectrum from high-dimensional noisy point cloud
We systematically study the spectrum of the kernel-based graph Laplacian (GL) constructed from a high-dimensional, noisy random point cloud in the nonnull setup. The problem is motivated by the model in which the clean signal is sampled from a manifold embedded in a low-dimensional Euclidean subspace and is then corrupted by high-dimensional noise. We quantify how the signal and noise interact over different regions of the signal-to-noise ratio (SNR) and report the resulting peculiar spectral behavior of the GL. In addition, we explore the impact of the chosen kernel bandwidth on the spectrum of the GL over different regions of SNR, which leads to an adaptive choice of kernel bandwidth that coincides with common practice on real data. This result paves the way to a theoretical understanding of how practitioners apply the GL when the dataset is noisy.
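To make the setup concrete, the following is a minimal numerical sketch of the kind of experiment the abstract describes. Everything specific in it is an assumption for illustration and is not taken from the paper: the clean signal is a unit circle placed in the first two coordinates, the noise is i.i.d. standard Gaussian, the kernel is Gaussian, the GL is represented by the symmetrically normalized affinity matrix (which has the same eigenvalues as the row-normalized one), and the "adaptive" bandwidth is taken to be a quantile of the pairwise squared distances. The function names `noisy_point_cloud` and `affinity_spectrum` are hypothetical.

```python
import numpy as np


def noisy_point_cloud(n, p, snr, rng):
    """Sample n points in R^p: a scaled circle (assumed signal) plus Gaussian noise."""
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    signal = np.zeros((n, p))
    signal[:, 0] = np.cos(theta)  # clean signal lives in a 2-dim subspace
    signal[:, 1] = np.sin(theta)
    noise = rng.standard_normal((n, p))  # high-dimensional noise in all p coordinates
    return np.sqrt(snr) * signal + noise


def affinity_spectrum(X, quantile=0.5):
    """Eigenvalues of D^{-1/2} W D^{-1/2} for a Gaussian kernel W.

    The bandwidth h is a quantile of the pairwise squared distances
    (an assumed data-driven rule, not necessarily the paper's).
    """
    sq_norms = np.sum(X**2, axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    dist2 = np.maximum(dist2, 0.0)  # clip rounding errors
    np.fill_diagonal(dist2, 0.0)

    upper = np.triu_indices_from(dist2, k=1)
    h = np.quantile(dist2[upper], quantile)  # data-driven bandwidth

    W = np.exp(-dist2 / h)  # Gaussian kernel affinity matrix
    degrees = W.sum(axis=1)
    S = W / np.sqrt(np.outer(degrees, degrees))  # similar to D^{-1} W

    eigenvalues = np.linalg.eigvalsh(S)[::-1]  # sorted in descending order
    return eigenvalues, h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 400
    for snr in (1.0, float(p), float(p) ** 2):
        X = noisy_point_cloud(n, p, snr, rng)
        eigenvalues, h = affinity_spectrum(X)
        print(f"snr={snr:.0f}  h={h:.1f}  top eigenvalues={eigenvalues[:4]}")
```

Running the script for several values of `snr` and `quantile` gives a rough empirical view of how the leading eigenvalues respond to the two quantities the abstract studies. It is only a toy experiment and does not reproduce the paper's analysis.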