Topic coverage

5 papers with code • 3 benchmarks • 1 dataset

A prevalent use case of topic models is topic discovery. However, most topic model evaluation methods rely on abstract metrics such as perplexity or topic coherence. The topic coverage approach instead measures a model's performance by matching model-generated topics to a fixed set of reference topics, that is, topics discovered by humans and represented in a machine-readable format. This way, models are evaluated in the context of their use, essentially by simulating topic modeling in a fixed setting defined by a text collection and a set of reference topics. Reference topics represent a ground truth that can be used to evaluate both topic models and other measures of model performance. The coverage approach enables large-scale automatic evaluation of existing and future topic models.
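The matching idea described above can be illustrated with a minimal sketch. It is not the method of any paper listed on this page: it assumes that each topic is a vector of word weights over a shared vocabulary, that topics are compared by cosine distance, and that a reference topic counts as covered when some model topic lies within a chosen distance threshold. The function names and the threshold value of 0.4 are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat an all-zero topic as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def topic_coverage(reference_topics, model_topics, threshold=0.4):
    """Fraction of reference topics matched by at least one model topic.

    The matching rule (cosine distance <= threshold) is an assumption
    made for illustration only.
    """
    if not reference_topics:
        raise ValueError("at least one reference topic is required")
    covered = 0
    for ref in reference_topics:
        # Distance from this reference topic to its nearest model topic.
        nearest = min(
            (cosine_distance(ref, topic) for topic in model_topics),
            default=1.0,
        )
        if nearest <= threshold:
            covered += 1
    return covered / len(reference_topics)


# Toy example over a four-word vocabulary (made-up weights).
reference_topics = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 1.0, 0.0, 0.0],
]
model_topics = [
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.9, 0.1],
]

coverage = topic_coverage(reference_topics, model_topics)
print(f"coverage = {coverage:.3f}")
```

In this toy setting the first two reference topics each have a nearby model topic, while the third does not, so two of three reference topics are covered. Tightening or loosening the threshold changes which matches count, which is why a single fixed threshold is only one of several possible design choices.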

Benchmarks

These leaderboards are used to track progress in topic coverage.

Dataset	Best Model
Topic modeling topic coverage dataset - news	PYP
Topic modeling topic coverage dataset - bio	NMF-200
Topic modeling topic coverage dataset	AuCDC

Datasets

Topic modeling topic coverage dataset

Most implemented papers

A Topic Coverage Approach to Evaluation of Topic Models

dkorenci/topic_coverage • 11 Dec 2020

When topic models are used for discovery of topics in text collections, a question that arises naturally is how well the model-induced topics correspond to topics of interest to the analyst.

Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders

RowitZou/RankAE • 14 Dec 2020

Automatic chat summarization can help people quickly grasp important information from numerous chat messages.

AugESC: Dialogue Augmentation with Large Language Models for Emotional Support Conversation

thu-coai/augesc • 26 Feb 2022

Applying this approach, we construct AugESC, an augmented dataset for the ESC task, which largely extends the scale and topic coverage of the crowdsourced ESConv corpus.

MUG: A General Meeting Understanding and Generation Benchmark

alibaba-damo-academy/spokennlp • 24 Mar 2023

To prompt SLP advancement, we establish a large-scale general Meeting Understanding and Generation Benchmark (MUG) to benchmark the performance of a wide range of SLP tasks, including topic segmentation, topic-level and session-level extractive summarization and topic title generation, keyphrase extraction, and action item detection.

ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval

yueyu1030/ReGen • 18 May 2023

With the development of large language models (LLMs), zero-shot learning has attracted much attention for various NLP tasks.
