Text Classification

1101 papers with code • 150 benchmarks • 148 datasets

Text Classification is the task of assigning a sentence or document to an appropriate category. The categories depend on the chosen dataset and can range from broad topics to fine-grained labels such as sentiment or intent.

Text Classification problems include emotion classification, news classification, and citation intent classification, among others. Benchmark datasets for evaluating text classification capabilities include GLUE and AGNews.

In recent years, deep learning models such as XLNet and RoBERTa have delivered some of the largest performance gains on text classification benchmarks.

(Image credit: Text Classification Algorithms: A Survey)
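
As a concrete starting point, the sketch below runs an off-the-shelf classifier through the Hugging Face `transformers` pipeline. The checkpoint name is an assumption; any sequence-classification model fine-tuned on a labeled text dataset works the same way.

```python
# Minimal sketch: text classification with a pretrained model via the
# Hugging Face `transformers` pipeline. The checkpoint is an assumption.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # assumed checkpoint
)

print(classifier("The match ended in a dramatic last-minute goal."))
# -> [{'label': ..., 'score': ...}]
```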

Most implemented papers

Universal Sentence Encoder

facebookresearch/InferSent 29 Mar 2018

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance.
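
The transfer recipe studied here (encode sentences with a fixed pretrained model, then train a light classifier on the embeddings) can be sketched as follows. `sentence-transformers` and the `all-MiniLM-L6-v2` checkpoint are stand-ins for the paper's TensorFlow Hub encoder, not the authors' setup.

```python
# Sketch of the transfer recipe: fixed sentence embeddings plus a
# light classifier on top. Encoder choice is an assumed stand-in.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed stand-in encoder

train_texts = ["great movie", "terrible plot", "loved it", "waste of time"]
train_labels = [1, 0, 1, 0]

X = encoder.encode(train_texts)              # (n_sentences, dim) embeddings
clf = LogisticRegression().fit(X, train_labels)

print(clf.predict(encoder.encode(["what a fantastic film"])))
```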

XLNet: Generalized Autoregressive Pretraining for Language Understanding

zihangdai/xlnet NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
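
A minimal sketch of using XLNet as a text classifier via `transformers`; the classification head below is freshly initialized, so its outputs are meaningful only after fine-tuning.

```python
# Sketch: XLNet with a sequence-classification head. Fine-tuning
# specifics (data, schedule) are omitted here.
import torch
from transformers import XLNetTokenizer, XLNetForSequenceClassification

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=2
)

inputs = tokenizer("A readable introduction to permutation language modeling.",
                   return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # class probabilities (head untrained)
```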

Unsupervised Data Augmentation for Consistency Training

google-research/uda NeurIPS 2020

In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
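
A minimal sketch of the consistency objective, assuming PyTorch, a generic `model` that returns logits, and an `augment` function standing in for back-translation or another advanced augmenter the paper advocates:

```python
# Sketch of UDA's training objective: supervised cross-entropy plus a
# consistency term between an unlabeled example and its augmented view.
import torch
import torch.nn.functional as F

def uda_loss(model, x_sup, y_sup, x_unsup, augment, lam=1.0):
    # Supervised term: ordinary cross-entropy on labeled examples.
    sup = F.cross_entropy(model(x_sup), y_sup)
    # Consistency term: the clean prediction is treated as a fixed
    # target (no gradient), as in the paper.
    with torch.no_grad():
        target = F.softmax(model(x_unsup), dim=-1)
    log_pred = F.log_softmax(model(augment(x_unsup)), dim=-1)
    unsup = F.kl_div(log_pred, target, reduction="batchmean")
    return sup + lam * unsup
```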

RoFormer: Enhanced Transformer with Rotary Position Embedding

ZhuiyiTechnology/roformer 20 Apr 2021

Then, we propose a novel method named Rotary Position Embedding (RoPE) to effectively leverage the positional information.
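
A NumPy sketch of the rotation itself: each consecutive pair of feature dimensions is rotated by a position-dependent angle, so relative offsets surface in the query-key dot product. The pairing and the 10000 base follow the paper; everything else is illustrative.

```python
# Sketch of Rotary Position Embedding (RoPE) applied to queries or keys.
import numpy as np

def rope(x):
    # x: (seq_len, dim), dim even
    seq_len, dim = x.shape
    pos = np.arange(seq_len)[:, None]                         # (seq_len, 1)
    inv_freq = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))  # (dim/2,)
    angles = pos * inv_freq                                   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                           # paired dims
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                        # 2-D rotation
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = rope(np.random.randn(8, 64))  # rotate queries (and keys) before attention
```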

How to Fine-Tune BERT for Text Classification?

xuyige/BERT4doc-Classification 14 May 2019

Language model pre-training has proven to be useful in learning universal language representations.
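
The basic recipe the paper analyzes (BERT plus a classification head, trained end to end with a small learning rate) looks roughly like this; the data and hyperparameters below are placeholders, not the paper's exact settings.

```python
# Sketch of one fine-tuning step for BERT-based text classification.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # small LR

batch = tokenizer(["a fine film", "dreadful"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

model.train()
loss = model(**batch, labels=labels).loss  # cross-entropy from the head
loss.backward()
optimizer.step()
```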

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

jasonwei20/eda_nlp IJCNLP 2019

We present EDA: easy data augmentation techniques for boosting performance on text classification tasks.
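
Two of EDA's four operations (random swap and random deletion) need no external resources and can be sketched directly; synonym replacement and random insertion additionally require a synonym source such as WordNet.

```python
# Sketch of two EDA operations. Parameters mirror the paper's idea of a
# single strength knob (alpha) controlling how much each sentence changes.
import random

def random_swap(words, n=1):
    words = words[:]
    for _ in range(n):
        i, j = random.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p=0.1):
    kept = [w for w in words if random.random() > p]
    return kept or [random.choice(words)]  # never delete everything

sent = "the quick brown fox jumps over the lazy dog".split()
print(" ".join(random_swap(sent)))
print(" ".join(random_deletion(sent)))
```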

Parameter-Efficient Transfer Learning for NLP

google-research/adapter-bert 2 Feb 2019

On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task.
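
The adapter itself is a small bottleneck with a residual connection, initialized near the identity so training starts from the frozen pretrained model. A PyTorch sketch, with sizes as assumptions:

```python
# Sketch of an adapter module: down-project, nonlinearity, up-project,
# plus a residual connection. Inserted inside each Transformer layer
# while the pretrained weights stay frozen.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, d_model=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)  # few parameters per task
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()
        nn.init.zeros_(self.up.weight)  # near-identity at initialization
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

h = torch.randn(2, 16, 768)     # (batch, seq, hidden)
print(Adapter()(h).shape)       # torch.Size([2, 16, 768])
```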

FNet: Mixing Tokens with Fourier Transforms

google-research/google-research NAACL 2022

At longer input lengths, our FNet model is significantly faster: when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, while outpacing the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs).
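
The mixing step is tiny: a 2-D Fourier transform over the sequence and hidden dimensions, keeping the real part, with no learned parameters. A PyTorch sketch:

```python
# Sketch of FNet's token mixing: a parameter-free replacement for
# self-attention that mixes information across all tokens at once.
import torch

def fourier_mixing(x):
    # x: (batch, seq_len, hidden)
    return torch.fft.fft2(x, dim=(-2, -1)).real

x = torch.randn(2, 128, 256)
print(fourier_mixing(x).shape)  # same shape, tokens mixed globally
```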

Ask Me Anything: Dynamic Memory Networks for Natural Language Processing

DongjunLee/dmn-tensorflow 24 Jun 2015

Most tasks in natural language processing can be cast into question answering (QA) problems over language input.
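
The framing amounts to rewriting each task as (input, question, answer) triples that a single QA model can consume; the examples below are illustrative, not drawn from the paper's datasets.

```python
# Sketch of casting classification and tagging as QA-style triples.
examples = [
    {"input": "The movie was wonderful.",
     "question": "What is the sentiment?",
     "answer": "positive"},                   # sentiment classification as QA
    {"input": "Jane ran home.",
     "question": "What are the part-of-speech tags?",
     "answer": "NNP VBD NN ."},               # POS tagging as QA
]
for ex in examples:
    print(f"Q: {ex['question']} | I: {ex['input']} -> A: {ex['answer']}")
```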

Simple Recurrent Units for Highly Parallelizable Recurrence

asappresearch/sru EMNLP 2018

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.
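
The trick is that SRU's matrix products depend only on the current input, so they batch across all timesteps; only cheap element-wise gating stays sequential. A NumPy sketch with assumed weight shapes:

```python
# Sketch of the SRU recurrence: projections run in parallel over time,
# the loop carries only element-wise state updates.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sru(x, W, Wf, bf, Wr, br):
    # x: (seq_len, d). No projection needs the previous state,
    # so all three parallelize across timesteps:
    xt = x @ W                  # candidate values
    f = sigmoid(x @ Wf + bf)    # forget gates
    r = sigmoid(x @ Wr + br)    # reset gates
    c = np.zeros(x.shape[1])
    h = np.empty_like(x)
    for t in range(x.shape[0]):        # only element-wise work is sequential
        c = f[t] * c + (1 - f[t]) * xt[t]
        h[t] = r[t] * np.tanh(c) + (1 - r[t]) * x[t]  # highway connection
    return h

d = 8
out = sru(np.random.randn(5, d), np.random.randn(d, d),
          np.random.randn(d, d), np.zeros(d),
          np.random.randn(d, d), np.zeros(d))
```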