Part-Of-Speech Tagging

214 papers with code • 17 benchmarks • 26 datasets

Part-of-speech tagging (POS tagging) is the task of labeling each word in a text with its part of speech. A part of speech is a category of words with similar grammatical properties. Common English parts of speech include noun, verb, adjective, adverb, pronoun, preposition, and conjunction.

Example:

Vinken  ,  61  years  old
NNP     ,  CD  NNS    JJ
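
The tags in the example above follow the Penn Treebank convention (NNP = proper noun, CD = cardinal number, NNS = plural noun, JJ = adjective). As a minimal sketch of what "tagging" means in practice, the snippet below reproduces that example with a toy rule-and-lexicon tagger. The function name `toy_pos_tag` and the tiny lexicon `TOY_LEXICON` are hypothetical and exist only for illustration; real taggers are typically statistical or neural models trained on annotated corpora, not hand-written rules like these.

```python
# Toy lexicon mapping lowercase tokens to Penn Treebank tags.
# Illustrative assumption only; not taken from any real tagger.
TOY_LEXICON = {
    "years": "NNS",  # plural noun
    "old": "JJ",     # adjective
}


def toy_pos_tag(tokens):
    """Assign one tag per non-empty token using simple rules and the toy lexicon."""
    tagged = []
    for token in tokens:
        if token in {",", "."}:
            tag = token   # in the Penn Treebank, these punctuation marks are their own tag
        elif token.isdigit():
            tag = "CD"    # cardinal number
        elif token.lower() in TOY_LEXICON:
            tag = TOY_LEXICON[token.lower()]
        elif token[0].isupper():
            tag = "NNP"   # guess: a capitalized unknown word is a proper noun
        else:
            tag = "NN"    # fallback: singular common noun
        tagged.append((token, tag))
    return tagged


tokens = ["Vinken", ",", "61", "years", "old"]
print(toy_pos_tag(tokens))
# [('Vinken', 'NNP'), (',', ','), ('61', 'CD'), ('years', 'NNS'), ('old', 'JJ')]
```

The output is a list of (token, tag) pairs, one per input token, which is the usual shape of a POS tagger's result. The capitalization and fallback rules here would mis-tag most real text; they only serve to show the input and output of the task.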


Latest papers with no code

A Morphology-Based Investigation of Positional Encodings

no code yet • 6 Apr 2024

How does the importance of positional encoding in pre-trained language models (PLMs) vary across languages with different morphological complexity?

ZAEBUC-Spoken: A Multilingual Multidialectal Arabic-English Speech Corpus

no code yet • 27 Mar 2024

We present ZAEBUC-Spoken, a multilingual multidialectal Arabic-English speech corpus.

The Comparison of Translationese in Machine Translation and Human Translation in terms of Translation Relations

no code yet • 27 Mar 2024

This study explores the distinctions between neural machine translation (NMT) and human translation (HT) through the lens of translation relations.

NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems

no code yet • 7 Mar 2024

Aware of the shortcomings of existing NLPre evaluation approaches, we investigate a novel method of reliable and fair evaluation and performance reporting.

Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5

no code yet • 4 Mar 2024

The VocaTT (vocabulary teaching and training) engine is written in Python and comprises three basic steps: pre-processing target word lists, generating sentences and candidate word options using GPT, and finally selecting suitable word options.

VNLP: Turkish NLP Package

no code yet • 2 Mar 2024

In this work, we present VNLP: the first dedicated, complete, open-source, well-documented, lightweight, production-ready, state-of-the-art Natural Language Processing (NLP) package for the Turkish language.

Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models

no code yet • 28 Feb 2024

Despite the predominance of English in their training data, English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities.

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

no code yet • 21 Feb 2024

To address this challenge, we propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks.

Punctuation Restoration Improves Structure Understanding without Supervision

no code yet • 13 Feb 2024

Unsupervised learning objectives like language modeling and de-noising constitute a significant part in producing pre-trained models that perform various downstream applications from natural language understanding to conversational tasks.

A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions

no code yet • 23 Jan 2024

One explanation for this bias is that AI models are trained on limited datasets, and using such a term in training data is more likely to appear in a toxic utterance.