Token Classification

29 papers with code • 213 benchmarks • 28 datasets


Most implemented papers

WangchanBERTa: Pretraining transformer-based Thai Language Models

vistec-AI/thai2transformers 24 Jan 2021

For a relatively low-resource language such as Thai, the choice of models is limited to training a BERT-based model on a much smaller dataset or finetuning multilingual models, both of which yield suboptimal downstream performance.

Detecting Label Errors in Token Classification Data

cleanlab/cleanlab 8 Oct 2022

Mislabeled examples are a common issue in real-world data, particularly for tasks like token classification where many labels must be chosen on a fine-grained basis.
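
To make the "fine-grained" point concrete, here is a minimal sketch of how per-token labels might be screened for likely errors. It is not the cleanlab API and not necessarily the paper's exact method. The tag set, the sentence, the probabilities, and the helper names `token_scores` and `sentence_score` are all hypothetical. Each token is scored by the probability a model assigns to its given label, and the sentence is scored by its lowest-scoring token.

```python
# Hypothetical toy data: one sentence, its given (possibly noisy) tags,
# and made-up per-token class probabilities from some trained model.
LABELS = ["O", "B-LOC", "B-PER"]
tokens = ["Alice", "flew", "to", "Paris"]
given = ["B-PER", "O", "O", "O"]  # "Paris" is tagged "O", which looks wrong
pred_probs = [
    [0.05, 0.05, 0.90],  # Alice
    [0.95, 0.03, 0.02],  # flew
    [0.97, 0.02, 0.01],  # to
    [0.10, 0.85, 0.05],  # Paris
]


def token_scores(given_tags, probs_per_token, labels=LABELS):
    """Return, for each token, the probability assigned to its given tag."""
    return [
        probs[labels.index(tag)]
        for tag, probs in zip(given_tags, probs_per_token)
    ]


def sentence_score(scores):
    """Score a sentence by its least trustworthy token (lower = more suspect)."""
    return min(scores)


scores = token_scores(given, pred_probs)
suspect = tokens[scores.index(min(scores))]
print(suspect, sentence_score(scores))
```

Ranking sentences by this score puts the ones with at least one doubtful token first, which is where manual review is most likely to pay off.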

Label Supervised LLaMA Finetuning

4ai/ls-llama 2 Oct 2023

We evaluate this approach with Label Supervised LLaMA (LS-LLaMA), which is based on LLaMA-2-7B, a relatively small-scale LLM, and can be finetuned on a single GeForce RTX 4090 GPU.

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking

samuelbroscheit/entity_knowledge_in_bert CoNLL 2019

We show on an entity linking benchmark that (i) this model improves the entity representations over plain BERT, (ii) it outperforms entity linking architectures that optimize the tasks separately, and (iii) it comes second only to the current state-of-the-art, which does mention detection and entity disambiguation jointly.

Common-Knowledge Concept Recognition for SEVA

jitinkrishnan/NASA-SE 26 Mar 2020

We build a common-knowledge concept recognition system for a Systems Engineer's Virtual Assistant (SEVA) which can be used for downstream tasks such as relation extraction, knowledge graph construction, and question-answering.

Counterfactual Detection meets Transfer Learning

Kc2fresh/Extracting-Counterfactual-data 27 May 2020

We can consider counterfactuals as belonging to the domain of discourse structure and semantics, a core area of natural language understanding. In this paper, we introduce an approach to counterfactual detection as well as the indexing of the antecedents and consequents of counterfactual statements.

On Long-Tailed Phenomena in Neural Machine Translation

vyraun/long-tailed Findings of the Association for Computational Linguistics 2020

State-of-the-art Neural Machine Translation (NMT) models struggle to generate low-frequency tokens, and addressing this remains a major challenge.

Indic-Transformers: An Analysis of Transformer Language Models for Indian Languages

Neural-Space/indic-transformers 4 Nov 2020

Language models based on the Transformer architecture have achieved state-of-the-art performance on a wide range of NLP tasks such as text classification, question-answering, and token classification.

VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups

allenai/VILA 1 Jun 2021

Experiments are conducted on a newly curated evaluation suite, S2-VLUE, that unifies existing automatically-labeled datasets and includes a new dataset of manual annotations covering diverse papers from 19 scientific disciplines.