
Text Classification

99 papers with code · Natural Language Processing

Text classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from broad topics to fine-grained labels such as sentiment.

State-of-the-art leaderboards

Greatest papers with code

Adversarial Training Methods for Semi-Supervised Text Classification

25 May 2016 · tensorflow/models

Adversarial training provides a means of regularizing supervised learning algorithms while virtual adversarial training is able to extend supervised learning algorithms to the semi-supervised setting. However, both methods require making small perturbations to numerous entries of the input vector, which is inappropriate for sparse high-dimensional inputs such as one-hot word representations.
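The adversarial perturbation in Miyato et al. is therefore applied to the word embeddings rather than the one-hot inputs. A minimal sketch of that step, assuming the gradient of the loss with respect to one embedding is already available (the values below are illustrative only):

```python
import math

def adversarial_perturbation(grad, epsilon=0.02):
    # Perturb the word embedding (not the sparse one-hot input) in the
    # gradient direction, scaled to L2 norm epsilon: r_adv = eps * g / ||g||_2
    norm = math.sqrt(sum(g * g for g in grad)) or 1.0
    return [epsilon * g / norm for g in grad]

# hypothetical gradient of the loss w.r.t. one word embedding
grad = [0.3, -0.4, 0.0]
r = adversarial_perturbation(grad)
```

Training then adds `r` to the embedding and minimizes the loss on the perturbed input alongside the clean one; virtual adversarial training uses the same idea with a label-free divergence, which is what extends it to unlabeled data.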

SENTIMENT ANALYSIS TEXT CLASSIFICATION WORD EMBEDDINGS

Semi-supervised Sequence Learning

NeurIPS 2015 · tensorflow/models

The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. In our experiments, we find that long short-term memory recurrent networks after being pretrained with the two approaches are more stable and generalize better.

LANGUAGE MODELLING TEXT CLASSIFICATION

FastText.zip: Compressing text classification models

12 Dec 2016 · facebookresearch/fastText

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory. After considering different solutions inspired by the hashing literature, we propose a method built upon product quantization to store word embeddings.
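Product quantization compresses embeddings by splitting each vector into sub-vectors and storing, for each, only the index of its nearest codebook centroid. A minimal sketch with toy hand-written codebooks (in practice the codebooks are learned with k-means):

```python
def pq_encode(vec, codebooks):
    # Split the embedding into len(codebooks) sub-vectors; for each, keep
    # only the index of the nearest centroid (one byte per sub-vector
    # when a codebook has at most 256 entries).
    d = len(vec) // len(codebooks)
    codes = []
    for i, book in enumerate(codebooks):
        sub = vec[i * d:(i + 1) * d]
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in book]
        codes.append(dists.index(min(dists)))
    return codes

def pq_decode(codes, codebooks):
    # Reconstruct an approximate embedding by concatenating the centroids.
    out = []
    for code, book in zip(codes, codebooks):
        out.extend(book[code])
    return out

# toy codebooks: 2 sub-spaces, 2 centroids each
books = [[[0.0, 0.0], [1.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
codes = pq_encode([0.9, 1.1, 0.1, 0.9], books)
approx = pq_decode(codes, books)
```

The memory saving comes from storing a handful of small integer codes per word instead of the full float vector, at the cost of the reconstruction error visible in `approx`.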

TEXT CLASSIFICATION WORD EMBEDDINGS

Bag of Tricks for Efficient Text Classification

EACL 2017 · facebookresearch/fastText

This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation.
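The core of the fastText classifier is simple enough to sketch: average the word (and n-gram) embeddings into one text vector, then apply a linear classifier. A toy illustration with made-up embeddings and weights (the real model learns both, and adds softmax and hashing tricks):

```python
def fasttext_score(tokens, embeddings, weights):
    # Average the word embeddings into a single text representation,
    # then score each class with a linear layer (dot product).
    dim = len(next(iter(embeddings.values())))
    avg = [0.0] * dim
    for t in tokens:
        vec = embeddings.get(t, [0.0] * dim)  # unknown words contribute zero
        for j, v in enumerate(vec):
            avg[j] += v / len(tokens)
    return [sum(a * w for a, w in zip(avg, row)) for row in weights]

# toy embeddings and class weights (hypothetical, 2 classes)
emb = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
W = [[1.0, -1.0], [-1.0, 1.0]]
scores = fasttext_score(["good"], emb, W)
```

Because both steps are linear, training and inference reduce to a few matrix products, which is where the orders-of-magnitude speedup over deep models comes from.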

TEXT CLASSIFICATION

StarSpace: Embed All The Things!

12 Sep 2017 · facebookresearch/ParlAI

We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task.
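In this scheme an entity's embedding is the sum of the embeddings of its discrete features, and tasks differ only in which entity pairs are compared. A minimal sketch with a toy, hand-written embedding table (StarSpace learns the table; the feature names here are hypothetical):

```python
def embed(features, table):
    # A StarSpace entity is a bag of discrete features; its embedding is
    # the sum of the learned feature embeddings.
    dim = len(next(iter(table.values())))
    out = [0.0] * dim
    for f in features:
        for j, v in enumerate(table.get(f, [0.0] * dim)):
            out[j] += v
    return out

def similarity(a, b):
    # Entities are compared with an inner product (cosine is also common).
    return sum(x * y for x, y in zip(a, b))

# toy table: words and labels live in the same embedding space
table = {"win": [1.0, 0.0], "vote": [0.0, 1.0],
         "__label__sports": [1.0, 0.0], "__label__politics": [0.0, 1.0]}
doc = embed(["win"], table)
```

Text classification then amounts to ranking label embeddings by similarity to the document embedding; swapping "labels" for "items" or "graph neighbors" gives the other tasks in the list.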

COLLABORATIVE FILTERING TEXT CLASSIFICATION WORD EMBEDDINGS

PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts

IJCNLP 2017 · beamandrew/medical-data

The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task.

SENTENCE CLASSIFICATION

Revisiting Semi-Supervised Learning with Graph Embeddings

29 Mar 2016 · tkipf/gcn

We present a semi-supervised learning framework based on graph embeddings. Given a graph between instances, we train an embedding for each instance to jointly predict the class label and the neighborhood context in the graph.

ENTITY EXTRACTION NODE CLASSIFICATION TEXT CLASSIFICATION

The Natural Language Decathlon: Multitask Learning as Question Answering

ICLR 2019 · salesforce/decaNLP

Furthermore, we present a new Multitask Question Answering Network (MQAN) that jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. Though designed for decaNLP, MQAN also achieves state-of-the-art results on the WikiSQL semantic parsing task in the single-task setting.

DOMAIN ADAPTATION MACHINE TRANSLATION NAMED ENTITY RECOGNITION NATURAL LANGUAGE INFERENCE QUESTION ANSWERING RELATION EXTRACTION SEMANTIC PARSING SEMANTIC ROLE LABELING SENTIMENT ANALYSIS TEXT CLASSIFICATION TRANSFER LEARNING

Simple Recurrent Units for Highly Parallelizable Recurrence

EMNLP 2018 · taolei87/sru

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability.
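The trick behind the SRU is that every matrix multiplication depends only on the current input, so it can be computed for all timesteps at once; only cheap element-wise operations remain in the sequential loop. A scalar (one-dimensional) sketch of the recurrence, with hypothetical weights (the real SRU is vectorized and includes layer normalization and dropout):

```python
import math

def sru_cell(xs, wf, bf, wr, br, w):
    # All input transforms are computed up front ("in parallel" across time);
    # the loop below is purely element-wise over the cell state.
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    xt = [w * x for x in xs]                      # transformed inputs
    f = [sigmoid(wf * x + bf) for x in xs]        # forget gates
    r = [sigmoid(wr * x + br) for x in xs]        # reset gates
    c, hs = 0.0, []
    for t in range(len(xs)):
        c = f[t] * c + (1.0 - f[t]) * xt[t]       # element-wise state update
        hs.append(r[t] * math.tanh(c) + (1.0 - r[t]) * xs[t])  # highway output
    return hs

hs = sru_cell([1.0, -1.0, 0.5], 0.0, 0.0, 0.0, 0.0, 1.0)
```

Because `c` is updated element-wise rather than through a recurrent matrix multiply, the expensive work batches across timesteps, which is what makes the unit highly parallelizable on GPUs.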

MACHINE TRANSLATION QUESTION ANSWERING TEXT CLASSIFICATION

Universal Sentence Encoder

29 Mar 2018 · facebookresearch/InferSent

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. We find that transfer learning using sentence embeddings tends to outperform word level transfer.

SEMANTIC TEXTUAL SIMILARITY SENTENCE EMBEDDINGS SENTIMENT ANALYSIS SUBJECTIVITY ANALYSIS TEXT CLASSIFICATION TRANSFER LEARNING WORD EMBEDDINGS