Text Categorization

34 papers with code • 0 benchmarks • 3 datasets

Text Categorization is the task of automatically assigning pre-defined categories to documents written in natural languages. Several types of Text Categorization have been studied, each of which deals with different types of documents and categories, such as topic categorization to detect discussed topics (e.g., sports, politics), spam detection, and sentiment classification to determine the sentiment typically in product or movie reviews.

Source: Effective Use of Word Order for Text Categorization with Convolutional Neural Networks

Greatest papers with code

Automatically Annotated Turkish Corpus for Named Entity Recognition and Text Categorization using Large-Scale Gazetteers

juand-r/entity-recognition-datasets 8 Feb 2017

Turkish Wikipedia Named-Entity Recognition and Text Categorization (TWNERTC) dataset is a collection of automatically categorized and annotated sentences obtained from Wikipedia.

Named Entity Recognition NER +1

PySS3: A Python package implementing a novel text classifier with visualization tools for Explainable AI

sergioburdisso/pyss3 19 Dec 2019

A recently introduced text classifier, called SS3, has obtained state-of-the-art performance on the CLEF's eRisk tasks.

Classification Document Classification +4

t-SS3: a text classifier with dynamic n-grams for early risk detection over text streams

sergioburdisso/pyss3 11 Nov 2019

SS3 was created to deal with ERD problems naturally since: it supports incremental training and classification over text streams, and it can visually explain its rationale.

Anorexia Detection Classification +5

Inverse-Category-Frequency based supervised term weighting scheme for text categorization

zveryansky/textvec 13 Dec 2010

Term weighting schemes often dominate the performance of many classifiers, such as kNN, centroid-based classifier and SVMs.

Information Retrieval Multi-class Classification +1

Massively Multilingual Word Embeddings

idiap/mhan 5 Feb 2016

We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.

Multilingual Word Embeddings Text Categorization

Using the Tsetlin Machine to Learn Human-Interpretable Rules for High-Accuracy Text Categorization with Medical Applications

cair/TextUnderstandingTsetlinMachine 12 Sep 2018

The Tsetlin Machine either performs on par with or outperforms all of the evaluated methods on both the 20 Newsgroups and IMDb datasets, as well as on a non-public clinical dataset.

Language understanding Natural Language Understanding +1

Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations

HongyangGao/hConv-gPool-Net 21 Jan 2019

Another limitation of GCN when used on graph-based text representation tasks is that, GCNs do not consider the order information of nodes in graph.

Text Categorization

Structure-Aware Convolutional Neural Networks

vector-1127/SACNNs NeurIPS 2018

Convolutional neural networks (CNNs) are inherently subject to invariable filters that can only aggregate local inputs with the same topological structures.

Action Recognition Activity Detection +3