Text Classification
1106 papers with code • 93 benchmarks • 136 datasets
Text Classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset; topics are one common example.
Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews, among others.
In recent years, deep learning models such as XLNet and RoBERTa have produced some of the largest performance gains on text classification problems.
(Image credit: Text Classification Algorithms: A Survey)
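To make the task definition above concrete, here is a minimal sketch of a text classifier: a bag-of-words multinomial Naive Bayes model with add-one smoothing, written in plain Python. It is not one of the deep learning models mentioned above; the toy training sentences, the labels, and the function names (`tokenize`, `train`, `predict`) are all illustrative assumptions rather than anything taken from a benchmark or library.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Very simple tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior for the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothed log likelihood of the word.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-written training data (hypothetical, for illustration only).
examples = [
    ("the team won the match", "sports"),
    ("a late goal decided the game", "sports"),
    ("the election results were announced", "politics"),
    ("parliament passed the new law", "politics"),
]
model = train(examples)

print(predict(model, "the team scored a goal"))
print(predict(model, "parliament announced the law"))
```

On real benchmarks such as AGNews, the same train/predict shape applies, but the model and tokenizer would typically be replaced by something far stronger than this sketch.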
Subtasks
- Topic Models
- Document Classification
- Sentence Classification
- Emotion Classification
- Multi-Label Text Classification
- Few-Shot Text Classification
- Text Categorization
- Semi-Supervised Text Classification
- Coherence Evaluation
- Toxic Comment Classification
- Citation Intent Classification
- Cross-Domain Text Classification
- Unsupervised Text Classification
- Satire Detection
- Hierarchical Text Classification of Blurbs (GermEval 2019)
- Variable Detection
Latest papers with no code
Lightweight Conceptual Dictionary Learning for Text Classification Using Information Compression
We propose a novel, lightweight supervised dictionary learning framework for text classification based on data compression and representation.
GuideWalk -- Heterogeneous Data Fusion for Enhanced Learning -- A Multiclass Document Classification Case
The success of the proposed embedding method is tested in classification problems.
Social Media and Artificial Intelligence for Sustainable Cities and Societies: A Water Quality Analysis Use-case
To ensure the quality of water, different methods for monitoring and assessing the water networks, such as offline and online surveys, are used.
Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment
This study proposes a multi-modal fusion framework, Multitrans, based on the Transformer architecture and self-attention mechanism.
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes
We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes.
AI-Enhanced Cognitive Behavioral Therapy: Deep Learning and Large Language Models for Extracting Cognitive Pathways from Social Media Texts
Cognitive Behavioral Therapy (CBT) is an effective technique for addressing the irrational thoughts stemming from mental illnesses, but it necessitates precise identification of cognitive pathways to be successfully implemented in patient care.
A Novel ICD Coding Framework Based on Associated and Hierarchical Code Description Distillation
To address these problems, we propose a novel framework based on associated and hierarchical code description distillation (AHDD) for better code representation learning and avoidance of improper code assignment. We utilize the code description and the hierarchical structure inherent to the ICD codes.
Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification
This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions.
Incubating Text Classifiers Following User Instruction with Nothing but LLM
In this paper, we aim to generate text classification data given arbitrary class definitions (i.e., user instruction), so one can train a small text classifier without any human annotation or raw corpus.
Empowering Interdisciplinary Research with BERT-Based Models: An Approach Through SciBERT-CNN with Topic Modeling
Researchers must stay current in their fields by regularly reviewing academic literature, a task complicated by the daily publication of thousands of papers.