Low Resource Named Entity Recognition
8 papers with code • 3 benchmarks • 4 datasets
Low-resource named entity recognition is the task of leveraging data and models from a language with ample resources (e.g., English) to solve named entity recognition in another, typically lower-resource, language.
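One common way to carry out such cross-lingual transfer is annotation projection: NER tags predicted on the resource-rich language are copied onto target-language tokens through word alignments. The sketch below illustrates the idea on hypothetical toy data; the tokens, tags, and alignment pairs are invented for illustration, and in practice the alignment would come from a word aligner.

```python
# Minimal sketch of annotation projection for cross-lingual NER transfer.
# Source-language (e.g., English) BIO tags are projected onto target-language
# tokens through word alignments; all data here is hypothetical toy input.

def project_tags(src_tags, alignment, tgt_len):
    """Copy each source token's BIO tag to its aligned target token.

    alignment: list of (src_index, tgt_index) pairs, e.g., from a word aligner.
    Unaligned target tokens fall back to the outside tag "O".
    """
    tgt_tags = ["O"] * tgt_len
    for src_i, tgt_i in alignment:
        tgt_tags[tgt_i] = src_tags[src_i]
    return tgt_tags

src_tokens = ["Angela", "Merkel", "visited", "Paris"]
src_tags = ["B-PER", "I-PER", "O", "B-LOC"]
tgt_tokens = ["Angela", "Merkel", "besuchte", "Paris"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]

print(project_tags(src_tags, alignment, len(tgt_tokens)))
# ['B-PER', 'I-PER', 'O', 'B-LOC']
```

The projected tags form a (noisy) training set for a target-language tagger; real alignments are rarely one-to-one, which is one source of the label noise mentioned below.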
Recent advances in language modeling using deep neural networks have shown that these models learn representations that vary with network depth, from morphology up to semantic relationships such as co-reference.
In low-resource settings, the performance of supervised labeling models can be improved with automatically annotated or distantly supervised data, which is cheap to create but often noisy.
Recently, neural methods have achieved state-of-the-art (SOTA) results in Named Entity Recognition (NER) tasks for many languages without the need for manually crafted features.
However, designing such hand-crafted features for low-resource languages is challenging, because exhaustive entity gazetteers do not exist for these languages.
Building reliable named entity recognition (NER) systems from limited annotated data has recently attracted much attention.
Distant supervision allows obtaining labeled training corpora for low-resource settings where only limited hand-annotated data exists.
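A typical distant-supervision recipe matches a small entity gazetteer against raw text to produce BIO labels automatically; the result is cheap but noisy, since matching misses unlisted entities and can mislabel ambiguous spans. Below is a minimal sketch with a hypothetical two-entry gazetteer; the greedy longest-match strategy is one simple choice among several.

```python
# Sketch of distant supervision for NER: auto-label tokens by matching a
# small entity gazetteer (hypothetical entries) against raw text, producing
# noisy BIO training labels without manual annotation.

def distant_label(tokens, gazetteer):
    """Greedy longest-match labeling; unmatched tokens get "O"."""
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest spans first so multi-word entities win.
        for length in range(len(tokens) - i, 0, -1):
            span = " ".join(tokens[i:i + length])
            if span in gazetteer:
                ent_type = gazetteer[span]
                labels[i] = f"B-{ent_type}"
                for j in range(i + 1, i + length):
                    labels[j] = f"I-{ent_type}"
                i += length
                matched = True
                break
        if not matched:
            i += 1
    return labels

gazetteer = {"New York": "LOC", "United Nations": "ORG"}
tokens = ["The", "United", "Nations", "met", "in", "New", "York"]
print(distant_label(tokens, gazetteer))
# ['O', 'B-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC']
```

Labels produced this way are usually paired with noise-robust training objectives rather than used as gold annotations.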
Pre-trained language models (PLM) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data.
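One popular way to turn PLM token embeddings into a few-shot NER classifier is the prototypical approach: each entity type is represented by the mean embedding of its few support tokens, and a query token takes the label of the nearest prototype. The sketch below uses tiny toy vectors as stand-ins for real PLM embeddings; the support set and dimensions are invented for illustration.

```python
# Sketch of prototypical few-shot NER classification: each entity type gets
# a prototype (mean of its few support-token embeddings from a PLM), and a
# query token is labeled with its nearest prototype. The 2-d vectors here
# are hypothetical stand-ins for real PLM token embeddings.

import math

def mean_vec(vecs):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_prototype(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))

# Hypothetical support set: a few labeled token embeddings per class.
support = {
    "PER": [[0.9, 0.1], [1.0, 0.0]],
    "LOC": [[0.1, 0.9], [0.0, 1.0]],
    "O":   [[0.5, 0.5]],
}
prototypes = {label: mean_vec(vecs) for label, vecs in support.items()}

print(nearest_prototype([0.8, 0.2], prototypes))
# PER
```

Because only the prototypes depend on labeled data, such classifiers can adapt to new entity types from a handful of examples, which is why they pair naturally with continued pre-training or light fine-tuning of the underlying PLM.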