Vector Quantization (k-means problem)

5 papers with code • 0 benchmarks • 0 datasets

Given a data set $X$ of $d$-dimensional numeric vectors and a number $k$, find a codebook $C$ of $k$ $d$-dimensional vectors such that the sum of squared distances from each $x \in X$ to its nearest $c \in C$ is as small as possible. This is also known as the k-means problem and is known to be NP-hard.
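
The objective above, together with one iteration of the classic Lloyd heuristic for locally minimizing it, can be sketched in plain Python (function names and the list-of-lists data layout are illustrative choices, not from any particular library):

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two d-dimensional vectors."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

def quantization_cost(X, C):
    """The k-means objective: sum over x in X of the squared distance
    to the nearest codebook vector c in C."""
    return sum(min(sq_dist(x, c) for c in C) for x in X)

def lloyd_step(X, C):
    """One Lloyd iteration: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    k, d = len(C), len(C[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for x in X:
        j = min(range(k), key=lambda i: sq_dist(x, C[i]))
        counts[j] += 1
        for t in range(d):
            sums[j][t] += x[t]
    new_C = []
    for i in range(k):
        if counts[i]:
            new_C.append([s / counts[i] for s in sums[i]])
        else:
            new_C.append(C[i])  # keep the old centroid if its cluster is empty
    return new_C
```

Iterating `lloyd_step` until the codebook stops changing yields a local optimum of the objective; since the problem is NP-hard, no efficient method is known to guarantee the global optimum, which is why the papers below focus on acceleration and better local search.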

Libraries

Use these libraries to find Vector Quantization (k-means problem) models and implementations

Most implemented papers

Fast K-Means with Accurate Bounds

idiap/eakmeans 8 Feb 2016

We propose a novel accelerated exact k-means algorithm, which performs better than the current state-of-the-art low-dimensional algorithm in 18 of 22 experiments, running up to 3 times faster.

Breathing K-Means

gittar/breathing-k-means 28 Jun 2020

For larger values of m, e.g., m = 20, breathing k-means is likely the new SOTA for the k-means problem.

Learning the k in k-means

elki-project/elki NeurIPS 2003

The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution.
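
The core of that test can be sketched as follows: split a cluster in two with k-means, project its points onto the axis connecting the two child centroids, and apply an Anderson–Darling normality test to the 1-D projections. This is a simplified stdlib-only sketch; the critical value `1.0` is illustrative, and the paper's small-sample correction of the statistic is omitted:

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(xs):
    """Anderson-Darling A^2 statistic for normality, with the mean and
    standard deviation estimated from the sample itself."""
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    z = sorted((x - mu) / sigma for x in xs)
    F = NormalDist().cdf
    s = sum((2 * i + 1) * (math.log(F(z[i])) + math.log(1 - F(z[n - 1 - i])))
            for i in range(n))
    return -n - s / n

def gmeans_split_test(points, c1, c2, critical=1.0):
    """G-means-style check: project points onto the line through the two
    child centroids and test the projections for Gaussianity.
    Returns True if normality is rejected, i.e. the split should be kept."""
    v = [a - b for a, b in zip(c1, c2)]
    norm2 = sum(vi * vi for vi in v)
    proj = [sum(pi * vi for pi, vi in zip(p, v)) / norm2 for p in points]
    return anderson_darling_normal(proj) > critical
```

If the projections look Gaussian the split is undone and the parent cluster kept; otherwise the children are recursively tested, which is how G-means grows k from an initial guess without it being specified in advance.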

The Effect of Points Dispersion on the $k$-nn Search in Random Projection Forests

mashaan14/RPTree 25 Feb 2023

$k$-nn search in an rpForest is influenced by two factors: 1) the dispersion of points along the random direction and 2) the number of rpTrees in the rpForest.

Data Aggregation for Hierarchical Clustering

elki-project/elki 5 Sep 2023

Hierarchical Agglomerative Clustering (HAC) is likely the earliest and most flexible clustering method, because it can be used with many distances, similarities, and various linkage strategies.