Clustering

2809 papers with code • 0 benchmarks • 5 datasets

Clustering is the task of grouping unlabeled data points into disjoint subsets. Each data point is assigned to exactly one cluster, and the number of clusters is not known a priori. The grouping criterion is typically based on the similarity of data points to one another.
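
Since the task statement does not prescribe an algorithm, here is a minimal sketch using k-means from scikit-learn (an assumed dependency) on synthetic data, with the silhouette score used to choose the number of clusters when it is not known in advance.

```python
# Minimal clustering sketch with scikit-learn (assumed dependency).
# The number of clusters k is picked by silhouette score, since it is
# typically not known ahead of time.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=0)  # unlabeled points

best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)  # higher = tighter, better-separated clusters
    if score > best_score:
        best_k, best_score = k, score

print(f"chosen k={best_k}, silhouette={best_score:.3f}")
```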

Most implemented papers

FaceNet: A Unified Embedding for Face Recognition and Clustering

serengil/deepface CVPR 2015

On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%.
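
FaceNet's premise is that faces are mapped to an embedding space where distances reflect face similarity, so clustering reduces to clustering the embedding vectors. The sketch below is a hypothetical illustration: `face_embeddings` is a placeholder for vectors produced by a FaceNet-style model (for example via serengil/deepface), and the distance threshold is an assumed value.

```python
# Hypothetical sketch: clustering precomputed face embeddings. `face_embeddings`
# stands in for vectors from a FaceNet-style model; random data is used so the
# snippet runs standalone.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
face_embeddings = rng.normal(size=(200, 128))  # placeholder 128-d embeddings
face_embeddings /= np.linalg.norm(face_embeddings, axis=1, keepdims=True)

# With no preset number of identities, a distance threshold replaces a fixed
# cluster count: faces closer than the threshold end up in the same group.
clusterer = AgglomerativeClustering(n_clusters=None, distance_threshold=1.0,
                                    linkage="average")
identity_ids = clusterer.fit_predict(face_embeddings)
print(identity_ids[:10])
```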

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

UKPLab/sentence-transformers IJCNLP 2019

However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.
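
The "about 50 million" figure is simply the number of sentence pairs, n(n-1)/2 for n = 10,000. Sentence-BERT avoids it by encoding each sentence once and comparing embeddings; the sketch below assumes the sentence-transformers package and the 'all-MiniLM-L6-v2' checkpoint, neither of which is specified in the excerpt.

```python
# Where "about 50 million" comes from, and the embedding-based alternative.
n = 10_000
print(n * (n - 1) // 2)  # 49,995,000 pairwise comparisons

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model choice
sentences = ["first sentence", "second sentence", "third sentence"]
embeddings = model.encode(sentences)              # n forward passes, not n*(n-1)/2
scores = util.cos_sim(embeddings, embeddings)     # cheap pairwise cosine similarities
print(scores)
```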

Adversarial Autoencoders

eriklindernoren/PyTorch-GAN 18 Nov 2015

In this paper, we propose the "adversarial autoencoder" (AAE), which is a probabilistic autoencoder that uses the recently proposed generative adversarial networks (GAN) to perform variational inference by matching the aggregated posterior of the hidden code vector of the autoencoder with an arbitrary prior distribution.
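
A hedged PyTorch sketch of that idea: a discriminator learns to tell samples from a chosen prior (a standard Gaussian here, which is an assumption) apart from encoder outputs, and the encoder is trained to fool it, pushing the aggregated posterior toward the prior. Layer sizes, optimizers, and the reconstruction loss are illustrative, not the paper's exact configuration.

```python
# Minimal adversarial-autoencoder sketch in PyTorch (illustrative sizes).
import torch
import torch.nn as nn

x_dim, z_dim = 784, 8
encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

x = torch.rand(32, x_dim)  # stand-in batch; real data would go here

# 1) Reconstruction phase: ordinary autoencoder update.
z = encoder(x)
recon_loss = nn.functional.binary_cross_entropy(decoder(z), x)
opt_ae.zero_grad(); recon_loss.backward(); opt_ae.step()

# 2) Regularization phase: discriminator separates prior samples from codes ...
z = encoder(x).detach()
z_prior = torch.randn_like(z)  # arbitrary prior: standard Gaussian here (assumption)
d_loss = bce(discriminator(z_prior), torch.ones(32, 1)) + \
         bce(discriminator(z), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# ... and the encoder (as "generator") is updated so its codes look like prior
# samples, matching the aggregated posterior to the prior.
g_loss = bce(discriminator(encoder(x)), torch.ones(32, 1))
opt_ae.zero_grad(); g_loss.backward(); opt_ae.step()
```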

XGBoost: A Scalable Tree Boosting System

dmlc/xgboost 9 Mar 2016

In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges.
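
For context, a minimal usage sketch of the library via its scikit-learn wrapper; the dataset and hyperparameters are illustrative assumptions rather than anything from the paper.

```python
# Minimal XGBoost usage sketch (scikit-learn wrapper; illustrative settings).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```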

SOLO: Segmenting Objects by Locations

open-mmlab/mmdetection ECCV 2020

We present a new, embarrassingly simple approach to instance segmentation in images.

Unsupervised Deep Embedding for Clustering Analysis

piiswrong/dec 19 Nov 2015

Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms.

Deep Speaker: an End-to-End Neural Speaker Embedding System

philipperemy/deep-speaker 5 May 2017

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.
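
Because the embeddings live on a hypersphere, scoring two utterances is just a dot product of L2-normalized vectors. In the sketch below, `emb_a` and `emb_b` are placeholders for utterance embeddings from a Deep Speaker-style model; the dimensionality is assumed.

```python
# Sketch of the scoring step: L2-normalize speaker embeddings onto the unit
# hypersphere, then cosine similarity is a dot product.
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
emb_a = normalize(rng.normal(size=512))   # placeholder utterance embeddings
emb_b = normalize(rng.normal(size=512))

cosine_similarity = float(emb_a @ emb_b)  # higher = more likely the same speaker
print(cosine_similarity)
```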

Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering

boyangumn/DCN ICML 2017

To recover the `clustering-friendly' latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN).
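
A rough sketch of such a joint objective: an autoencoder supplies the dimensionality reduction, and a k-means-style penalty pulls each latent code toward its assigned centroid. The network sizes, trade-off weight, and single-step update are assumptions for illustration; the paper's actual alternating optimization is not reproduced here.

```python
# Sketch of a joint DR + k-means objective: reconstruction loss plus the
# distance of each latent code to its nearest cluster centroid (illustrative).
import torch
import torch.nn as nn

x_dim, z_dim, k = 784, 10, 4
encoder = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))
decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))
centroids = torch.randn(k, z_dim, requires_grad=True)  # cluster centers in latent space
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()) + [centroids],
                       lr=1e-3)

x = torch.rand(32, x_dim)   # stand-in batch
lam = 0.1                   # assumed trade-off weight between the two terms

z = encoder(x)
recon_loss = nn.functional.mse_loss(decoder(z), x)

dists = torch.cdist(z, centroids)     # (batch, k) distances to centroids
assign = dists.argmin(dim=1)          # hard cluster assignments
kmeans_loss = ((z - centroids[assign]) ** 2).sum(dim=1).mean()

loss = recon_loss + lam * kmeans_loss
opt.zero_grad(); loss.backward(); opt.step()
```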

Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering

slim1017/VaDE 16 Nov 2016

In this paper, we propose Variational Deep Embedding (VaDE), a novel unsupervised generative clustering approach within the framework of Variational Auto-Encoder (VAE).
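
VaDE's generative story places a mixture-of-Gaussians prior over the VAE latent code: pick a cluster, sample a latent vector from that cluster's Gaussian, then decode. The sketch below illustrates that process with placeholder sizes and an untrained decoder.

```python
# Sketch of VaDE's generative process (mixture-of-Gaussians prior over the
# latent code); sizes and the decoder are illustrative placeholders.
import torch
import torch.nn as nn

k, z_dim, x_dim = 10, 16, 784
pi = torch.full((k,), 1.0 / k)   # cluster prior
mu = torch.randn(k, z_dim)       # per-cluster means (learned in practice)
log_var = torch.zeros(k, z_dim)  # per-cluster log-variances
decoder = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim), nn.Sigmoid())

c = torch.distributions.Categorical(pi).sample()              # choose a cluster
z = mu[c] + torch.exp(0.5 * log_var[c]) * torch.randn(z_dim)  # sample its Gaussian
x = decoder(z)                                                # decode to data space
print(c.item(), x.shape)
```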

CatBoost: unbiased boosting with categorical features

catboost/catboost NeurIPS 2018

This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit.
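
A minimal usage sketch showing the library's native handling of categorical features via the `cat_features` argument; the toy DataFrame and column names are illustrative assumptions.

```python
# Minimal CatBoost usage sketch (toy data; illustrative settings).
import pandas as pd
from catboost import CatBoostClassifier

df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue", "green"],
    "size": [1.0, 2.5, 0.7, 3.2, 1.8, 2.1],
    "label": [0, 1, 0, 1, 1, 0],
})

model = CatBoostClassifier(iterations=50, verbose=False)
# CatBoost handles categorical columns natively via `cat_features`.
model.fit(df[["color", "size"]], df["label"], cat_features=["color"])
print(model.predict(df[["color", "size"]]))
```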