Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters
We study linear and kernel classification in the streaming model. For linear classification, we improve upon the algorithm of (Tai, et al. 2018), which solves the $\ell_1$ point query problem on the optimal weight vector $w_* \in \mathbb{R}^d$ in sublinear space. We first give an algorithm solving the more difficult $\ell_2$ point query problem on $w_*$, also in sublinear space. We also give an algorithm which solves the $\ell_2$ heavy hitter problem on $w_*$, in sublinear space and running time. Finally, we give an algorithm which can $\textit{deterministically}$ solve the $\ell_1$ point query problem on $w_*$, with sublinear space improving upon that of (Tai, et al. 2018). For kernel classification, if $w_* \in \mathbb{R}^{d^p}$ is the optimal weight vector classifying points in the stream according to their $p^{th}$-degree polynomial kernel, then we give an algorithm solving the $\ell_2$ point query problem on $w_*$ in $\text{poly}(\frac{p \log d}{\varepsilon})$ space, and an algorithm solving the $\ell_2$ heavy hitter problem in $\text{poly}(\frac{p \log d}{\varepsilon})$ space and running time. Note that our space and running time are polynomial in $p$, making our algorithms well-suited to high-degree polynomial kernels and the Gaussian kernel (approximated by the polynomial kernel of degree $p = \Theta(\log T)$).
PDF Abstract NeurIPS 2021 PDF NeurIPS 2021 Abstract