Nearly Optimal Dynamic $k$-Means Clustering for High-Dimensional Data

1 Feb 2018  ·  Wei Hu, Zhao Song, Lin F. Yang, Peilin Zhong ·

We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset. For this problem, we provide a one-pass coreset construction algorithm using space $\tilde{O}(k\cdot \mathrm{poly}(d, \log\Delta))$, where $k$ is the target number of centers. To our knowledge, this is the first dynamic geometric data stream algorithm for $k$-means using space polynomial in dimension and nearly optimal (linear) in $k$.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here