Text Classification

1102 papers with code • 150 benchmarks • 147 datasets

Text Classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from broad topics (e.g., news categories) to fine-grained labels such as emotions or citation intents.

Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews.

In recent years, deep learning models such as XLNet and RoBERTa have delivered some of the largest performance gains on text classification benchmarks.
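Models like these are typically fine-tuned on labeled data, but the core task itself (mapping a piece of text to a label) can be illustrated with a much simpler baseline. The following is a toy bag-of-words Naive Bayes classifier in plain Python, a sketch for intuition only; the training corpus is made up and this is not the transformer-based approach named above:

```python
import math
from collections import Counter, defaultdict

# Toy multinomial Naive Bayes over bag-of-words features.
# Illustrative only: modern systems fine-tune pretrained transformers.
def train(docs):
    # docs: list of (text, label) pairs
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of documents
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(text, word_counts, label_counts, vocab):
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior + sum of log likelihoods with add-one smoothing
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical two-class corpus
docs = [
    ("the team won the match", "sports"),
    ("a thrilling game for the home team", "sports"),
    ("the election results were announced", "politics"),
    ("the senate passed the new bill", "politics"),
]
model = train(docs)
print(classify("the team played a great game", *model))  # -> sports
```

Deep models replace the hand-built bag-of-words features with learned contextual representations, but the input/output contract of the task is the same.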

(Image credit: Text Classification Algorithms: A Survey)

Libraries

Use these libraries to find Text Classification models and implementations

Most implemented papers

Big Bird: Transformers for Longer Sequences

google-research/bigbird NeurIPS 2020

To remedy this, we propose BigBird, a sparse attention mechanism that reduces this quadratic dependency to linear.
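The "quadratic dependency" refers to full self-attention, where every token attends to every other token, so the number of attention scores grows as n². A sparse pattern like BigBird's (a local window plus a constant number of global and random tokens) gives each token a fixed amount of work, so total cost grows linearly. A back-of-the-envelope sketch, where the window/global/random sizes are illustrative and not the paper's exact configuration:

```python
# Compare attention-score counts for full vs. sparse attention
# as the sequence length n grows.
def full_attention_pairs(n):
    return n * n  # every token attends to every token: O(n^2)

def sparse_attention_pairs(n, window=3, n_global=2, n_random=3):
    # Each token attends to a fixed-size local neighborhood plus a
    # constant number of global and random tokens: O(n) overall.
    # (Sizes here are made up for illustration.)
    per_token = (2 * window + 1) + n_global + n_random
    return n * min(per_token, n)

for n in (64, 256, 1024, 4096):
    print(n, full_attention_pairs(n), sparse_attention_pairs(n))
```

Doubling the sequence length quadruples the full-attention count but only doubles the sparse one, which is what makes much longer sequences tractable.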

A C-LSTM Neural Network for Text Classification

zackhy/TextClassification 27 Nov 2015

In this work, we combine the strengths of both architectures and propose a novel and unified model called C-LSTM for sentence representation and text classification.

PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts

Franck-Dernoncourt/pubmed-rct IJCNLP 2017

First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task.

Graph Convolutional Networks for Text Classification

yao8839836/text_gcn 15 Sep 2018

We build a single text graph for a corpus based on word co-occurrence and document word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus.
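A sketch of that graph-building step: word–word edges come from co-occurrence within a sliding window, and document–word edges from term occurrence. Text GCN weights these edges with PMI and TF-IDF respectively; the sketch below uses raw counts for brevity, and the two-document corpus is made up:

```python
from collections import Counter

def build_text_graph(docs, window=2):
    """Build edge counts for a corpus graph: word-word edges from
    co-occurrence within a sliding window, doc-word edges from term
    counts. (Text GCN weights these with PMI and TF-IDF; raw counts
    are used here for simplicity.)"""
    word_word = Counter()  # (word_a, word_b) -> co-occurrence count
    doc_word = Counter()   # (doc_id, word)   -> term frequency
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            doc_word[(doc_id, tok)] += 1
            # pair tok with the next (window - 1) tokens
            for other in tokens[i + 1 : i + window]:
                if other != tok:
                    word_word[tuple(sorted((tok, other)))] += 1
    return word_word, doc_word

docs = ["deep learning for text", "graph learning for text classification"]
ww, dw = build_text_graph(docs)
print(ww[("for", "learning")])  # -> 2 (co-occurs in both documents)
```

The resulting graph has one node per word and per document; a GCN then propagates label information from labeled document nodes through shared words to unlabeled ones.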

Fastformer: Additive Attention Can Be All You Need

wuch15/Fastformer 20 Aug 2021

In this way, Fastformer can achieve effective context modeling with linear complexity.

Simplifying Graph Convolutional Networks

Tiiiger/SGC 19 Feb 2019

Graph Convolutional Networks (GCNs) and their variants have received significant attention and have become the de facto methods for learning graph representations.

FlauBERT: Unsupervised Language Model Pre-training for French

getalp/Flaubert LREC 2020

Language models have become a key step to achieve state-of-the art results in many different Natural Language Processing (NLP) tasks.

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

jind11/TextFooler 27 Jul 2019

Machine learning algorithms are often vulnerable to adversarial examples that have imperceptible alterations from the original counterparts but can fool the state-of-the-art models.

Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference

timoschick/pet 21 Jan 2020

Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with "task descriptions" in natural language (e.g., Radford et al., 2019).

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

punyajoy/HateXplain 18 Dec 2020

We also observe that models, which utilize the human rationales for training, perform better in reducing unintended bias towards target communities.