Part-Of-Speech Tagging
214 papers with code • 15 benchmarks • 26 datasets
Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech are noun, verb, adjective, adverb, pronoun, preposition, conjunction, etc.
Example:
| Vinken | , | 61 | years | old |
|--------|---|----|-------|-----|
| NNP    | , | CD | NNS   | JJ  |
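As a minimal illustration of the task (a hypothetical most-frequent-tag baseline invented here, not one of the models listed below), a tagger can simply assign each word its most common tag from a small annotated sample:

```python
from collections import Counter, defaultdict

def train_baseline(tagged_sentences):
    """Learn each word's most frequent tag from annotated data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, words, default="NN"):
    """Tag each word; unknown words fall back to a default tag."""
    return [(w, model.get(w, default)) for w in words]

# Toy training data in Penn Treebank style, matching the example above.
train = [[("Vinken", "NNP"), (",", ","), ("61", "CD"),
          ("years", "NNS"), ("old", "JJ")]]
model = train_baseline(train)
print(tag(model, ["Vinken", "years", "dog"]))
# → [('Vinken', 'NNP'), ('years', 'NNS'), ('dog', 'NN')]
```

Such lookup baselines are surprisingly strong on frequent words; the papers below improve chiefly on ambiguous and unseen words via contextual and character-level modeling.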
Libraries
Use these libraries to find Part-Of-Speech Tagging models and implementations.
Most implemented papers
Morphosyntactic Tagging with a Meta-BiLSTM Model over Context Sensitive Token Encodings
In this paper, we investigate models that use recurrent neural networks with sentence-level context for initial character and word-based representations.
Chinese Lexical Analysis with Deep Bi-GRU-CRF Network
Lexical analysis is believed to be a crucial step towards natural language understanding and has been widely studied.
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically-Rich Languages with BRNNs
We present LemmaTag, a featureless neural network architecture that jointly generates part-of-speech tags and lemmas for sentences by using bidirectional RNNs with character-level and word-level embeddings.
From POS tagging to dependency parsing for biomedical event extraction
Results: We perform an empirical study comparing state-of-the-art traditional feature-based and neural network-based models for two core natural language processing tasks of part-of-speech (POS) tagging and dependency parsing on two benchmark biomedical corpora, GENIA and CRAFT.
Semi-Supervised Sequence Modeling with Cross-View Training
We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data.
Glyce: Glyph-vectors for Chinese Character Representations
However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found.
Data Augmentation via Dependency Tree Morphing for Low-Resource Languages
Neural NLP systems achieve high scores in the presence of sizable training datasets.
OmniNet: A unified architecture for multi-modal multi-task learning
We also show that using this neural network pre-trained on some modalities assists in learning unseen tasks such as video captioning and video question answering.
Hierarchically-Refined Label Attention Network for Sequence Labeling
CRF has been used as a powerful model for statistical sequence labeling.
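CRF-style sequence labeling decodes the highest-scoring tag sequence by combining per-token emission scores with tag-transition scores. A minimal Viterbi decoding sketch in pure Python (the scores here are toy values invented for illustration):

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag path.

    emissions: one {tag: score} dict per token.
    transitions: {(prev_tag, cur_tag): score}.
    """
    # Initialize each path with the first token's emission score.
    best = {t: (emissions[0][t], [t]) for t in tags}
    for em in emissions[1:]:
        new = {}
        for cur in tags:
            # Best previous tag under (path score + transition score).
            prev = max(tags, key=lambda p: best[p][0] + transitions[(p, cur)])
            score = best[prev][0] + transitions[(prev, cur)] + em[cur]
            new[cur] = (score, best[prev][1] + [cur])
        best = new
    return max(best.values())[1]

tags = ["DT", "NN"]
emissions = [{"DT": 2.0, "NN": 0.1}, {"DT": 0.1, "NN": 2.0}]
transitions = {("DT", "NN"): 1.0, ("DT", "DT"): -1.0,
               ("NN", "DT"): 0.0, ("NN", "NN"): 0.0}
print(viterbi(emissions, transitions, tags))  # → ['DT', 'NN']
```

The label attention network in the paper above replaces this hand-crafted transition structure with learned, hierarchically refined label representations.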
Dice Loss for Data-imbalanced NLP Tasks
Many NLP tasks such as tagging and machine reading comprehension are faced with the severe data imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training.
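One common soft-Dice formulation (a generic sketch, not necessarily the exact variant used in the paper) scores the overlap between predicted probabilities and binary labels, which makes it less dominated by the abundant easy-negative examples than cross-entropy:

```python
def soft_dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss for binary labels.

    probs: predicted positive-class probabilities in [0, 1].
    targets: gold 0/1 labels of the same length.
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    # Smoothing keeps the loss defined when both sums are zero.
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)

print(soft_dice_loss([1.0, 0.0, 0.0], [1, 0, 0]))  # perfect prediction → 0.0
```

Because the score is driven by the intersection with positive labels, predicting everything as background yields a poor loss even when negatives vastly outnumber positives.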