TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Grammatical Error Detection	FCE	Bi-LSTM + charattn	F0.5	41.88	# 7
Part-Of-Speech Tagging	Penn Treebank	Bi-LSTM + charattn	Accuracy	97.27	# 18

Attending to Characters in Neural Sequence Labeling Models

COLING 2016 · Marek Rei, Gamal K. O. Crichton, Sampo Pyysalo

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
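
The dynamic weighting described in the abstract can be pictured as a learned gate that blends a word-level vector with a character-level vector of the same size. The following is a minimal sketch under stated assumptions, not the authors' released implementation: the function name `attention_combine`, the weight names `W1`, `W2`, `W3`, the dimensions, and the random values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(a):
    """Elementwise logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))


def attention_combine(x, m, W1, W2, W3):
    """Blend a word-level vector x with a character-level vector m.

    Assumption: the gate z is computed from both inputs through a small
    tanh layer followed by a sigmoid, giving one weight per dimension.
    """
    z = sigmoid(W3 @ np.tanh(W1 @ x + W2 @ m))  # gate values in (0, 1)
    combined = z * x + (1.0 - z) * m            # per-dimension convex mix
    return combined, z


# Illustrative (hypothetical) sizes and randomly initialised weights.
rng = np.random.default_rng(0)
dim, hidden = 4, 3
x = rng.normal(size=dim)             # word embedding
m = rng.normal(size=dim)             # character-based representation
W1 = rng.normal(size=(hidden, dim))
W2 = rng.normal(size=(hidden, dim))
W3 = rng.normal(size=(dim, hidden))

combined, z = attention_combine(x, m, W1, W2, W3)
print("gate:", z)
print("combined:", combined)
```

Because each gate value lies between 0 and 1, every dimension of the combined vector falls between the corresponding word-level and character-level values. A gate near 1 relies on the word embedding, while a gate near 0 falls back on the character-level component, which is the behaviour one would want for rare or unseen words.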