The problem of estimating the kernel mean in a reproducing kernel Hilbert
space (RKHS) is central to kernel methods in that it is used by classical
approaches (e.g., when centering a kernel PCA matrix), and it also forms the
core inference step of modern kernel methods (e.g., kernel-based non-parametric
tests) that rely on embedding probability distributions in RKHSs. Muandet et
al. (2014) has shown that shrinkage can help in constructing "better"
estimators of the kernel mean than the empirical estimator...
The present paper
studies the consistency and admissibility of the estimators in Muandet et al.
(2014), and proposes a wider class of shrinkage estimators that improve upon
the empirical estimator by considering appropriate basis functions. Using the
kernel PCA basis, we show that some of these estimators can be constructed
using spectral filtering algorithms which are shown to be consistent under some
technical assumptions. Our theoretical analysis also reveals a fundamental
connection to the kernel-based supervised learning framework. The proposed
estimators are simple to implement and perform well in practice.