Text Classification

1106 papers with code • 93 benchmarks • 136 datasets

Text Classification is the task of assigning an appropriate category to a sentence or document. The categories depend on the chosen dataset; topics are one common example.

Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews, among others.
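
As a minimal sketch of the task described above, the following example trains a multinomial Naive Bayes classifier with add-one smoothing on a handful of toy sentences. Naive Bayes is used here only as a simple, dependency-free illustration and is not a method taken from this page; the labels, the example sentences, and the helper names (tokenize, train, predict) are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization on lowercased text; real systems use richer tokenizers.
    return text.lower().split()


def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two topic categories, two sentences each.
TOY_DATA = [
    ("the team won the match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("shares fell as markets closed", "business"),
    ("the bank raised interest rates", "business"),
]

model = train(TOY_DATA)
print(predict(model, "the striker won the match"))
```

The same train/predict interface would apply to any of the problem types listed above; only the label set and the training data change.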

In recent years, deep learning models such as XLNet and RoBERTa have delivered some of the largest performance gains on text classification problems.

(Image credit: Text Classification Algorithms: A Survey)

Latest papers with no code

Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression

no code yet • 28 Apr 2024

We propose a novel, lightweight supervised dictionary learning framework for text classification based on data compression and representation.

GuideWalk -- Heterogeneous Data Fusion for Enhanced Learning -- A Multiclass Document Classification Case

no code yet • 25 Apr 2024

The success of the proposed embedding method is tested in classification problems.

Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case

no code yet • 23 Apr 2024

To ensure the quality of water, different methods for monitoring and assessing the water networks, such as offline and online surveys, are used.

Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment

no code yet • 19 Apr 2024

This study proposes a multi-modal fusion framework, Multitrans, based on the Transformer architecture and self-attention mechanism.

When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes

no code yet • 18 Apr 2024

We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes.

AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts

no code yet • 17 Apr 2024

Cognitive Behavioral Therapy (CBT) is an effective technique for addressing the irrational thoughts stemming from mental illnesses, but it necessitates precise identification of cognitive pathways to be successfully implemented in patient care.

A Novel ICD Coding Framework Based on Associated and Hierarchical Code Description Distillation

no code yet • 17 Apr 2024

To address these problems, we propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment. We utilize the code description and the hierarchical structure inherent to the ICD codes.

Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification

no code yet • 17 Apr 2024

This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions.

Incubating Text Classifiers Following User Instruction with Nothing but LLM

no code yet • 16 Apr 2024

In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus.

Empowering Interdisciplinary Research with BERT-Based Models: An Approach Through SciBERT-CNN with Topic Modeling

no code yet • 16 Apr 2024

Researchers must stay current in their fields by regularly reviewing academic literature, a task complicated by the daily publication of thousands of papers.