Short Text Clustering

14 papers with code • 8 benchmarks • 2 datasets

Most implemented papers

Discovering New Intents with Deep Aligned Clustering

thuiar/DeepAligned-Clustering 16 Dec 2020

In this work, we propose an effective method, Deep Aligned Clustering, to discover new intents with the aid of the limited known intent data.

Supporting Clustering with Contrastive Learning

amazon-research/sccl NAACL 2021

Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space.

Twin Contrastive Learning for Online Clustering

XLearning-SCU/2022-IJCV-TCL 21 Oct 2022

Specifically, we find that when the data is projected into a feature space with a dimensionality of the target cluster number, the rows and columns of its feature matrix correspond to the instance and cluster representation, respectively.
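The row/column reading described above can be illustrated with a toy matrix. This is only a minimal sketch, not the authors' implementation: the matrix values, the function names, and the argmax step for hard labels are all assumptions made for illustration.

```python
# Hypothetical feature matrix: 4 instances (rows) projected into a space
# whose dimensionality equals the target cluster number (3 columns).
features = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.6, 0.2],
]


def instance_representations(matrix):
    """Each row describes one instance across all clusters."""
    return [list(row) for row in matrix]


def cluster_representations(matrix):
    """Each column describes one cluster across all instances."""
    return [list(col) for col in zip(*matrix)]


def hard_assignments(matrix):
    """Assumed post-processing: pick the largest entry in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


rows = instance_representations(features)
cols = cluster_representations(features)
labels = hard_assignments(features)
```

Here `rows` has one vector per instance and `cols` has one vector per cluster, so the same matrix can be read at the instance level or at the cluster level.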

Self-Taught Convolutional Neural Networks for Short Text Clustering

jacoxu/STC2 1 Jan 2017

Short text clustering is a challenging problem due to the sparseness of its text representation.

A Self-Training Approach for Short Text Clustering

hadifar/stc_clustering WS 2019

Short text clustering is a challenging problem when adopting traditional bag-of-words or TF-IDF representations, since these lead to sparse vector representations of the short texts.
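The sparsity issue mentioned above is easy to see on a toy corpus. The sketch below builds plain bag-of-words count vectors with the standard library only and measures the fraction of zero entries. The example texts, the whitespace tokenizer, and the helper names `bag_of_words` and `sparsity` are assumptions for illustration and are not taken from the paper or its repository.

```python
from collections import Counter


def bag_of_words(texts):
    """Build a vocabulary and one count vector per text (whitespace tokens)."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


def sparsity(vectors):
    """Fraction of entries that are zero across all vectors."""
    total = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0)
    return zeros / total


# Hypothetical short texts, e.g. search queries or chat utterances.
texts = [
    "cheap flights to paris",
    "book hotel in rome",
    "reset my password",
    "paris hotel deals",
]
vocab, vectors = bag_of_words(texts)
zero_fraction = sparsity(vectors)
```

Even with four texts, most entries are zero, and the zero fraction grows as the vocabulary grows while each text stays a few words long.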

Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement

thuiar/CDAC-plus 20 Nov 2019

Identifying new user intents is an essential task in dialogue systems.

Enhancement of Short Text Clustering by Iterative Classification

rashadulrakib/short-text-clustering-enhancement 31 Jan 2020

Short text clustering is a challenging task due to the lack of signal contained in such short texts.

Intent Mining from past conversations for conversational agent

ajaychatterjee/IntentMining COLING 2020

In this paper, we present an intent discovery framework that generates intent training data from raw conversations in four primary steps: extraction of textual utterances from a conversation using a pre-trained, domain-agnostic Dialog Act Classifier (Data Extraction); automatic clustering of similar user utterances (Clustering); manual annotation of clusters with an intent label (Labeling); and propagation of intent labels to the utterances from the previous step that are not mapped to any cluster (Label Propagation).

Efficient Sparse Spherical k-Means for Document Clustering

johpro/esp-kmeans 30 Jul 2021

Spherical k-Means is frequently used to cluster document collections because it performs reasonably well in many settings and is computationally efficient.
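For readers unfamiliar with the baseline, the following is a minimal sketch of plain spherical k-means: points and centroids are kept at unit length, and each point is assigned to the centroid with the highest cosine similarity. It is not the efficient sparse variant from the paper or the code in the linked repository. The dense list representation, the naive initialization from the first `k` points, the fixed iteration count, and the toy data are assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(points, k, iterations=20):
    """Cluster by cosine similarity; returns (labels, unit-length centroids)."""
    points = [normalize(p) for p in points]
    # Assumption: naive initialization from the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: on unit vectors, the dot product is the cosine.
        labels = [
            max(range(k), key=lambda c: dot(p, centroids[c])) for p in points
        ]
        # Update step: sum the members and project back onto the unit sphere.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids


# Hypothetical 2-D "documents" pointing in two rough directions.
data = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.1], [0.1, 3.0]]
labels, centroids = spherical_kmeans(data, k=2)
```

Because only directions matter, `[1.0, 0.0]` and `[2.0, 0.1]` fall into the same cluster despite their different lengths, which is the property that makes the method a natural fit for length-normalized document vectors.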