Token Classification

45 papers with code • 19 benchmarks • 14 datasets

Most implemented papers

WangchanBERTa: Pretraining transformer-based Thai Language Models

vistec-AI/thai2transformers 24 Jan 2021

However, for a relatively low-resource language such as Thai, the choices of models are limited to training a BERT-based model based on a much smaller dataset or finetuning multi-lingual models, both of which yield suboptimal downstream performance.

Detecting Label Errors in Token Classification Data

cleanlab/cleanlab 8 Oct 2022

Mislabeled examples are a common issue in real-world data, particularly for tasks like token classification where many labels must be chosen on a fine-grained basis.

Retrieval Augmented Generation using Engineering Design Knowledge

siddharthl93/engineering-design-knowledge 13 Jul 2023

For this task, we create a dataset of 375,084 examples and fine-tune language models for relation identification (token classification) and elicitation (sequence-to-sequence).

Label Supervised LLaMA Finetuning

4ai/ls-llama 2 Oct 2023

We evaluate this approach with Label Supervised LLaMA (LS-LLaMA), which is based on LLaMA-2-7B, a relatively small-scale LLM, and can be finetuned on a single GeForce RTX4090 GPU.

Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction

chongzhangfdu/tpp 17 Oct 2023

However, the BIO-tagging scheme relies on the correct order of model inputs, which is not guaranteed in real-world NER on scanned VrDs, where text is recognized and arranged by OCR systems.
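
To illustrate why BIO tagging depends on input order, here is a minimal sketch of a generic BIO decoder. It is not the paper's Token Path Prediction method. The tokens, tags, and the shuffled "OCR" order are made-up illustrative data, and `decode_bio` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) spans from a BIO-tagged sequence."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag extends the span only if it directly follows it.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open span, closes the span.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Illustrative data (assumed): tokens in the correct reading order.
tokens = ["Invoice", "from", "Acme", "Trading", "Ltd"]
tags = ["O", "O", "B-ORG", "I-ORG", "I-ORG"]

# The same tokens and labels, arranged in a wrong order such as an OCR
# system might produce on a scanned document (assumed permutation).
order = [0, 2, 1, 4, 3]
shuffled_tokens = [tokens[i] for i in order]
shuffled_tags = [tags[i] for i in order]

in_order_spans = decode_bio(tokens, tags)
shuffled_spans = decode_bio(shuffled_tokens, shuffled_tags)
print(in_order_spans)
print(shuffled_spans)
```

With the tokens in reading order, the decoder recovers the full three-token organization name. With the permuted order, the same per-token labels yield only a fragment, because the continuation tags no longer follow their opening tag.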

Embedded Named Entity Recognition using Probing Classifiers

nicpopovic/stoke 18 Mar 2024

Streaming text generation has become a common way of increasing the responsiveness of language model powered applications, such as chat assistants.

POS-tagging to highlight the skeletal structure of sentences

disk0Dancer/rubert-finetuned-pos 21 Nov 2024

This study presents the development of a part-of-speech (POS) tagging model to extract the skeletal structure of sentences using transfer learning with the BERT architecture for token classification.

LettuceDetect: A Hallucination Detection Framework for RAG Applications

KRLabsOrg/LettuceDetect 24 Feb 2025

Retrieval Augmented Generation (RAG) systems remain vulnerable to hallucinated answers despite incorporating external knowledge sources.

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking

samuelbroscheit/entity_knowledge_in_bert CoNLL 2019

We show on an entity linking benchmark that (i) this model improves the entity representations over plain BERT, (ii) it outperforms entity linking architectures that optimize the tasks separately, and (iii) it comes second only to the current state of the art that does mention detection and entity disambiguation jointly.