Short Text Clustering

14 papers with code • 8 benchmarks • 2 datasets


Latest papers with no code

Federated Learning for Short Text Clustering

no code yet • 23 Nov 2023

The robust short text clustering module aims to train an effective short text clustering model with local data in each client.

CEIL: A General Classification-Enhanced Iterative Learning Framework for Text Clustering

no code yet • 20 Apr 2023

To address this issue, we propose CEIL, a novel Classification-Enhanced Iterative Learning framework for short text clustering, which aims at generally promoting the clustering performance by introducing a classification objective to iteratively improve feature representations.

EASE: Entity-Aware Contrastive Learning of Sentence Embedding

no code yet • ACL ARR January 2022

We present EASE, a novel method for learning sentence embeddings via contrastive learning between sentences and their related entities. The advantage of using entity supervision is twofold: (1) entities have been shown to be a strong indicator of text semantics and thus should provide rich training signals for sentence embeddings; (2) entities are defined independently of languages and thus offer useful cross-lingual alignment supervision. We evaluate EASE against other unsupervised models in both monolingual and multilingual settings. We show that EASE exhibits competitive or better performance on English semantic textual similarity (STS) and short text clustering (STC) tasks, and that it significantly outperforms baseline methods in multilingual settings on a variety of tasks. Our EASE model and newly constructed multilingual STC dataset, MewsC-15, have been made publicly available to catalyze future research on sentence embeddings.
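
The abstract above does not spell out the training objective, so the following is only a rough sketch of what contrastive learning between sentences and entities could look like, assuming a generic in-batch InfoNCE loss over cosine similarities. The function names, the toy vectors, and the temperature of 0.05 are illustrative assumptions, not details taken from EASE.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(sentence_vecs, entity_vecs, temperature=0.05):
    """Mean in-batch InfoNCE loss (assumed objective, not confirmed by the source).

    entity_vecs[i] is treated as the positive for sentence_vecs[i];
    every other entity in the batch acts as a negative.
    """
    losses = []
    for i, sent in enumerate(sentence_vecs):
        logits = [cosine(sent, ent) / temperature for ent in entity_vecs]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy, hand-written embeddings standing in for encoder outputs.
sentences = [[1.0, 0.0], [0.0, 1.0]]
entities_aligned = [[0.9, 0.1], [0.1, 0.9]]
entities_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = info_nce(sentences, entities_aligned)
loss_shuffled = info_nce(sentences, entities_shuffled)
```

In this sketch the loss is lower when each sentence vector sits close to its own entity vector than when the pairing is shuffled, which is the signal a contrastive objective of this kind would optimize.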

Representation Learning for Short Text Clustering

no code yet • 21 Sep 2021

Effective representation learning is critical for short text clustering because short text corpora are sparse, high-dimensional, and noisy.

Deep Clustering with Measure Propagation

no code yet • 18 Apr 2021

For example, deep embedded clustering (DEC) has greatly improved unsupervised clustering performance by using stacked autoencoders for representation learning.
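
The snippet above only mentions DEC's autoencoder-based representations. As a minimal sketch of the clustering step usually associated with DEC, the code below computes Student's t soft assignments, the sharpened target distribution, and the KL divergence between them. The hand-written 2-D `embeddings` and `centroids` are hypothetical stand-ins for autoencoder outputs and learned cluster centers, and no gradient update is shown.

```python
import math


def soft_assign(embeddings, centroids, alpha=1.0):
    """Student's t-kernel soft assignment q[i][j] of point i to centroid j."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened targets: p[i][j] proportional to q[i][j]^2 / f_j, f_j = sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pi * math.log(pi / qi)
        for p_row, q_row in zip(p, q)
        for pi, qi in zip(p_row, q_row)
        if pi > 0
    )


# Hypothetical embeddings: two points near each of two centroids.
embeddings = [[0.0, 0.1], [0.2, 0.0], [2.0, 2.1], [1.9, 2.0]]
centroids = [[0.0, 0.0], [2.0, 2.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

Squaring the assignments pushes each point's target further toward its most likely cluster, so minimizing the KL term would pull the soft assignments toward more confident values.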

Short Text Clustering with Transformers

no code yet • 31 Jan 2021

Recent techniques for the task of short text clustering often rely on word embeddings as a transfer learning component.

An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering

no code yet • ACL 2020

Clustering short text streams is a challenging task because of the unique properties of such streams: infinite length, sparse data representation, and cluster evolution.

Attentive Representation Learning with Adversarial Training for Short Text Clustering

no code yet • 8 Dec 2019

Relying on this, the representation learning and clustering for short texts are seamlessly integrated into a unified model.

Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration

no code yet • WS 2018

Text clustering is a powerful technique to detect topics from document corpora, so as to provide information browsing, analysis, and organization.