Short Text Clustering

7 papers with code • 8 benchmarks • 1 dataset


Greatest papers with code

Supporting Clustering with Contrastive Learning

makcedward/nlpaug NAACL 2021

Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space.

Contrastive Learning Short Text Clustering
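
To make "distance measured in the representation space" concrete, here is a minimal sketch of distance-based clustering using plain k-means on toy 2-D vectors. It is not the method of the paper above: the `kmeans` helper, the deterministic initialization, and the toy `embeddings` are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=10):
    """Plain k-means. Assumption: the first k points seed the centroids,
    which keeps the toy example deterministic."""
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids

# Hypothetical 2-D "embeddings" standing in for encoded short texts.
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels, centers = kmeans(embeddings, k=2)
print(labels)
```

In practice the vectors would come from a text encoder, and the quality of the clusters depends mostly on how well that representation separates the semantic categories.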

A Self-Training Approach for Short Text Clustering

hadifar/stc_clustering WS 2019

Short text clustering is a challenging problem when adopting traditional bag-of-words or TF-IDF representations, since these lead to sparse vector representations of the short texts.

Deep Clustering Sentence Embedding +1
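
The sparsity problem described above can be illustrated with a small sketch that builds TF-IDF vectors for a few short texts and counts the zero entries. The `tfidf_vectors` helper, the smoothed IDF formula, and the example texts are assumptions chosen for illustration, not taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocab, vectors) using whitespace tokens and a smoothed IDF.
    Assumption: idf = log((1 + n) / (1 + df)) + 1, so any term present in
    a document gets a strictly positive weight."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(tokenized)
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append([
            (tf[t] / length) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t in vocab
        ])
    return vocab, vectors

# Hypothetical short texts; each one touches only a few vocabulary terms.
docs = [
    "cheap flights to paris",
    "reset my password",
    "paris hotel deals",
    "password not working",
]
vocab, vectors = tfidf_vectors(docs)
zeros = sum(1 for vec in vectors for w in vec if w == 0.0)
sparsity = zeros / (len(vectors) * len(vocab))
print(f"vocabulary size: {len(vocab)}, fraction of zero entries: {sparsity:.2f}")
```

Because each short text contains only a handful of terms, most vector entries are zero, and two texts on the same topic may share no non-zero dimension at all.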

DECAF: Deep Extreme Classification with Label Features

Extreme-classification/DECAF 1 Aug 2021

This paper develops the DECAF algorithm that addresses these challenges by learning models enriched by label metadata that jointly learn model parameters and feature representations using deep networks and offer accurate classification at the scale of millions of labels.

Classification Extreme Multi-Label Classification +6

ECLARE: Extreme Classification with Label Graph Correlations

Extreme-classification/ECLARE 31 Jul 2021

This paper presents ECLARE, a scalable deep learning architecture that incorporates not only label text, but also label correlations, to offer accurate real-time predictions within a few milliseconds.

Classification Extreme Multi-Label Classification +6

Intent Mining from past conversations for conversational agent

ajaychatterjee/IntentMining COLING 2020

In this paper, we present an intent discovery framework that generates intent training data from raw conversations in four primary steps: (1) extraction of textual utterances from a conversation using a pre-trained, domain-agnostic Dialog Act Classifier (Data Extraction); (2) automatic clustering of similar user utterances (Clustering); (3) manual annotation of clusters with an intent label (Labeling); and (4) propagation of intent labels to the utterances from the previous step that are not mapped to any cluster (Label Propagation).

Intent Discovery Short Text Clustering
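
The label propagation step in the abstract above could be sketched as a nearest-neighbour lookup: each unclustered utterance takes the intent of its most similar labeled utterance, provided the similarity is high enough. This is only an assumed illustration, not the authors' implementation. The `jaccard` and `propagate_labels` helpers, the token-overlap similarity, the `threshold` value, and the example utterances are all hypothetical.

```python
def jaccard(a, b):
    """Token-level Jaccard similarity between two utterances."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def propagate_labels(labeled, unlabeled, threshold=0.2):
    """Give each unlabeled utterance the intent of its most similar labeled
    utterance, or None when the best similarity is below the threshold."""
    result = {}
    for utterance in unlabeled:
        best_text = max(labeled, key=lambda t: jaccard(utterance, t))
        score = jaccard(utterance, best_text)
        result[utterance] = labeled[best_text] if score >= threshold else None
    return result

# Hypothetical output of the Clustering and Labeling steps.
labeled = {
    "i want to reset my password": "reset_password",
    "book a flight to paris": "book_flight",
}
# Hypothetical utterances that were not mapped to any cluster.
unlabeled = ["please reset password", "flight to rome", "what is the weather"]

propagated = propagate_labels(labeled, unlabeled)
for utterance, intent in propagated.items():
    print(f"{utterance!r} -> {intent}")
```

Utterances that resemble nothing in the labeled set stay unlabeled, so unrelated text does not get forced into an existing intent.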