Connections and Equivalences between the Nyström Method and Sparse Variational Gaussian Processes
We investigate the connections between sparse approximation methods for making kernel methods and Gaussian processes (GPs) scalable to large-scale data, focusing on the Nyström method and Sparse Variational Gaussian Processes (SVGP). While sparse approximation methods for GPs and kernel methods share some algebraic similarities, the literature lacks a deep understanding of how and why they are related. This may pose an obstacle to communication between the GP and kernel communities, making it difficult to transfer results from one side to the other. Our motivation is to remove this obstacle by clarifying the connections between the sparse approximations for GPs and kernel methods. In this work, we study the two popular approaches, the Nyström and SVGP approximations, in the context of a regression problem, and establish various connections and equivalences between them. In particular, we provide an RKHS interpretation of the SVGP approximation and show that the Evidence Lower Bound of the SVGP contains the objective function of the Nyström approximation, revealing the origin of the algebraic equivalence between the two approaches. We also study recently established convergence results for the SVGP and how they relate to the approximation quality of the Nyström method.
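As background for the abstract's central object, the sketch below (not taken from the paper) illustrates the Nyström approximation of a kernel matrix using m inducing points; the kernel choice, the sizes n and m, and the variable names are illustrative assumptions.

```python
# Minimal sketch of the Nyström low-rank kernel approximation (illustrative only).
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential kernel k(a, b) = exp(-||a - b||^2 / (2 * lengthscale^2)).
    sq_dists = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq_dists / (2 * lengthscale**2))

rng = np.random.default_rng(0)
n, m = 500, 30                             # n data points, m inducing (landmark) points
X = rng.normal(size=(n, 2))
Z = X[rng.choice(n, m, replace=False)]     # inducing inputs chosen from the data

Knn = rbf_kernel(X, X)                     # full n x n kernel matrix
Knm = rbf_kernel(X, Z)                     # cross-covariances between data and inducing points
Kmm = rbf_kernel(Z, Z)                     # m x m kernel matrix on inducing points

# Nyström approximation: Knn ≈ Knm Kmm^{-1} Knm^T, a matrix of rank at most m.
K_nystrom = Knm @ np.linalg.solve(Kmm + 1e-8 * np.eye(m), Knm.T)

print("Frobenius error of the Nyström approximation:",
      np.linalg.norm(Knn - K_nystrom, "fro"))
```

The same low-rank structure Knm Kmm^{-1} Knm^T appears in the SVGP posterior over inducing points, which is the algebraic similarity the paper sets out to explain.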