Text Classification

500 papers with code • 38 benchmarks • 58 datasets

Text classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from broad topics, such as news categories, to fine-grained labels such as sentiment polarity.

(Image credit: Text Classification Algorithms: A Survey)

Greatest papers with code

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

tensorflow/models NAACL 2019

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

Common Sense Reasoning Conversational Response Selection +6
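
A minimal sketch of applying a pretrained BERT checkpoint to text classification through the Hugging Face `transformers` library (not the authors' code): the sequence-classification head is freshly initialized here, so the model must be fine-tuned on labeled data before its predictions mean anything, and `num_labels=2` is an arbitrary assumption.

```python
# Sketch: BERT for text classification via Hugging Face transformers.
# The classification head is randomly initialized, so outputs are
# meaningless until the model is fine-tuned; num_labels=2 is assumed.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

inputs = tokenizer("This movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_labels)
print(logits.argmax(dim=-1).item())  # predicted class index
```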

Big Bird: Transformers for Longer Sequences

tensorflow/models NeurIPS 2020

To remedy the quadratic dependency of full attention on sequence length, we propose BigBird, a sparse attention mechanism that reduces this dependency to linear.

Ranked #1 on Question Answering on Natural Questions (F1 (Long) metric)

Linguistic Acceptability Natural Language Inference +3
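
A sketch of how the block-sparse attention variant is exposed in `transformers`; the checkpoint name, block size, and random-block count below are illustrative defaults, and the classification head is untrained until fine-tuned.

```python
# Sketch: BigBird with block-sparse attention, which scales linearly with
# sequence length instead of quadratically as full attention does.
from transformers import BigBirdTokenizer, BigBirdForSequenceClassification

tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-roberta-base")
model = BigBirdForSequenceClassification.from_pretrained(
    "google/bigbird-roberta-base",
    attention_type="block_sparse",  # sparse pattern from the paper
    block_size=64,                  # tokens attend within local blocks...
    num_random_blocks=3,            # ...plus a few random blocks
)  # note: the classification head still needs fine-tuning

inputs = tokenizer(
    "a very long document " * 500,
    return_tensors="pt", truncation=True,
    max_length=4096,  # far beyond BERT's 512-token limit
)
logits = model(**inputs).logits
```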

Adversarial Training Methods for Semi-Supervised Text Classification

tensorflow/models 25 May 2016

Adversarial training provides a means of regularizing supervised learning algorithms, while virtual adversarial training is able to extend supervised learning algorithms to the semi-supervised setting.

General Classification Semi Supervised Text Classification +3
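
A minimal PyTorch sketch of the adversarial-training idea (perturbing word embeddings in the loss-gradient direction, in the spirit of this paper); the `model` interface that maps embeddings directly to logits is a hypothetical stand-in, not the authors' code.

```python
# Sketch of adversarial training on word embeddings: perturb the
# embeddings in the direction that most increases the loss, bounded
# by epsilon, then also train on the perturbed input.
import torch
import torch.nn.functional as F

def adversarial_loss(model, embeddings, labels, epsilon=1.0):
    # `model` is assumed to map embeddings -> logits (hypothetical).
    embeddings = embeddings.detach().requires_grad_(True)
    clean_loss = F.cross_entropy(model(embeddings), labels)
    (grad,) = torch.autograd.grad(clean_loss, embeddings, retain_graph=True)
    # Worst-case L2-bounded perturbation along the gradient direction.
    perturbation = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)
    adv_loss = F.cross_entropy(model(embeddings + perturbation.detach()), labels)
    return clean_loss + adv_loss
```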

Semi-supervised Sequence Learning

tensorflow/models NeurIPS 2015

In our experiments, we find that long short-term memory (LSTM) recurrent networks, after being pretrained with the two approaches (next-step language modeling and sequence autoencoding), are more stable and generalize better.

Language Modelling Text Classification
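
A simplified sketch of the pretrain-then-fine-tune recipe (dimensions are arbitrary): pretrain an LSTM as a language model on unlabeled text, then initialize the classifier with those weights so only the small head starts from scratch.

```python
# Sketch of the recipe from the paper, simplified; sizes are assumptions.
import torch.nn as nn

vocab_size, embed_dim, hidden_dim, num_classes = 10_000, 128, 256, 2

# Step 1: pretrain as a language model (predict the next token) on
# unlabeled text, training embedding + lstm + lm_head with cross-entropy.
embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
lm_head = nn.Linear(hidden_dim, vocab_size)

# Step 2: fine-tune for classification, reusing the pretrained embedding
# and LSTM; only this small head is initialized from scratch.
classifier_head = nn.Linear(hidden_dim, num_classes)
```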

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

huggingface/transformers NeurIPS 2020

With the success of language pretraining, it is highly desirable to develop more efficient, scalable architectures that can exploit the abundant unlabeled data at a lower cost.

Reading Comprehension Text Classification
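
Funnel-Transformer checkpoints are available through `transformers`; a minimal loading sketch follows (the checkpoint name and `num_labels` are illustrative, and the classification head requires fine-tuning).

```python
# Sketch: Funnel-Transformer for classification. The encoder gradually
# pools the sequence to shorter lengths, trading redundancy for compute.
from transformers import FunnelTokenizer, FunnelForSequenceClassification

tokenizer = FunnelTokenizer.from_pretrained("funnel-transformer/small")
model = FunnelForSequenceClassification.from_pretrained(
    "funnel-transformer/small", num_labels=2  # head needs fine-tuning
)

inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
logits = model(**inputs).logits
```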

FlauBERT: Unsupervised Language Model Pre-training for French

huggingface/transformers LREC 2020

Language models have become a key step to achieve state-of-the-art results in many different Natural Language Processing (NLP) tasks.

Language Modelling Natural Language Inference +2

XLNet: Generalized Autoregressive Pretraining for Language Understanding

huggingface/transformers NeurIPS 2019

With the capability of modeling bidirectional contexts, pretraining based on denoising autoencoding, as in BERT, achieves better performance than pretraining approaches based on autoregressive language modeling.

Document Ranking Humor Detection +7
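
A sketch of applying XLNet to text classification via `transformers`; the permutation-language-modeling pretraining is baked into the checkpoint, while the classification head below is untrained and `num_labels=2` is an assumption.

```python
# Sketch: XLNet for text classification via Hugging Face transformers.
from transformers import XLNetTokenizer, XLNetForSequenceClassification

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetForSequenceClassification.from_pretrained(
    "xlnet-base-cased", num_labels=2  # head is untrained; fine-tune first
)

inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
logits = model(**inputs).logits
```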

FastText.zip: Compressing text classification models

facebookresearch/fastText 12 Dec 2016

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.

General Classification Quantization +2
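
The compression described here is exposed in the official `fasttext` Python bindings as `quantize`; a sketch follows, where the file paths are placeholders and the cutoff value is an illustrative choice.

```python
# Sketch: train a fastText classifier, then compress it with product
# quantization so the full model fits in a small memory budget (.ftz).
import fasttext

# train.txt: one example per line, labels prefixed with "__label__".
model = fasttext.train_supervised(input="train.txt")

# cutoff prunes the vocabulary, retrain adjusts the remaining weights,
# qnorm quantizes vector norms separately (as described in the paper).
model.quantize(input="train.txt", cutoff=100_000, retrain=True, qnorm=True)
model.save_model("model.ftz")

print(model.predict("which baking dish is best for banana bread ?"))
```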

Universal Language Model Fine-tuning for Text Classification

fastai/fastai ACL 2018

Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.

General Classification Language Modelling +3
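
fastai packages the ULMFiT recipe directly; the sketch below uses the bundled IMDB sample and lets `fine_tune` handle the freeze/unfreeze schedule, skipping the intermediate language-model fine-tuning stage for brevity.

```python
# Sketch of ULMFiT in fastai: start from a pretrained AWD-LSTM language
# model and fine-tune a classifier on top of it.
from fastai.text.all import *

path = untar_data(URLs.IMDB_SAMPLE)
dls = TextDataLoaders.from_csv(path, csv_fname="texts.csv",
                               text_col="text", label_col="label")

learn = text_classifier_learner(dls, AWD_LSTM, drop_mult=0.5,
                                metrics=accuracy)
learn.fine_tune(4, 1e-2)  # wraps gradual unfreezing with one-cycle training

print(learn.predict("I really liked this movie!"))
```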