Document Classification

169 papers with code • 19 benchmarks • 12 datasets

Document Classification is the task of assigning one or more labels to a document, where the labels are drawn from a predetermined set.

Source: Long-length Legal Document Classification
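
As a rough illustration of the definition above, the following is a minimal sketch of single-label document classification using a bag-of-words Naive Bayes model written with the Python standard library only. The training documents, the two labels ("legal", "sports"), and the function names are illustrative assumptions, not taken from any of the papers listed on this page.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents and words per label; returns a simple model tuple."""
    label_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(text, model):
    """Return the label with the highest log-posterior under the model."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing so unseen label/word pairs are not zero.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: the predetermined label set is {"legal", "sports"}.
TRAIN_DOCS = [
    "the court granted the motion to dismiss",
    "the defendant filed an appeal with the court",
    "the team scored a late goal to win the match",
    "the striker missed a penalty in the final match",
]
TRAIN_LABELS = ["legal", "legal", "sports", "sports"]

model = train_naive_bayes(TRAIN_DOCS, TRAIN_LABELS)
print(classify("the court heard the appeal", model))
print(classify("a goal in the match", model))
```

The multi-label case ("one or more labels") is often handled by training one binary classifier per label and returning every label whose score passes a threshold; that variant is not shown here.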

Most implemented papers

Graph Attention Networks

PetarV-/GAT ICLR 2018

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

Semi-Supervised Classification with Graph Convolutional Networks

dmlc/dgl 9 Sep 2016

We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs.

Revisiting Semi-Supervised Learning with Graph Embeddings

tkipf/gcn 29 Mar 2016

We present a semi-supervised learning framework based on graph embeddings.

On Calibration of Modern Neural Networks

gpleiss/temperature_scaling ICML 2017

Confidence calibration -- the problem of predicting probability estimates representative of the true correctness likelihood -- is important for classification models in many applications.
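
To make the notion of calibration concrete, here is a minimal sketch of expected calibration error (ECE), a commonly used summary of the gap between predicted confidence and observed accuracy. The equal-width binning, the default of 10 bins, and the toy inputs are assumptions made for illustration; this is not code from the gpleiss/temperature_scaling repository.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical overconfident model: always 90% confident, right half the time.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
print(round(overconfident, 3))
```

A model whose confidence matches its accuracy in every bin would score 0 under this measure; larger values indicate a bigger mismatch.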

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.

ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations

sinovation/ZEN Findings of the Association for Computational Linguistics 2020

The authors show that reasonable performance can be obtained when ZEN is trained on a small corpus, which is important for applying pre-training techniques to scenarios with limited data.

Improving Language Understanding by Generative Pre-Training

huggingface/transformers Preprint 2018

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

SPECTER: Document-level Representation Learning using Citation-informed Transformers

allenai/specter ACL 2020

We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph.

Geometric deep learning on graphs and manifolds using mixture model CNNs

dmlc/dgl CVPR 2017

Recently, there has been an increasing interest in geometric deep learning, attempting to generalize deep learning methods to non-Euclidean structured data such as graphs and manifolds, with a variety of applications from the domains of network analysis, computational social science, or computer graphics.

Learning to Skim Text

tsujuifu/pytorch_lstm-shuttle ACL 2017

Recurrent Neural Networks are showing much promise in many sub-areas of natural language processing, ranging from document classification to machine translation to automatic question answering.