A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER
The objective of few-shot named entity recognition is to identify named entities with limited labeled instances. Previous work has primarily focused on optimizing the traditional token-wise classification framework, while neglecting information derived from the characteristics of NER data. To address this issue, we propose a Multi-Task Semantic Decomposition Framework via Joint Task-specific Pre-training (MSDP) for few-shot NER. Drawing inspiration from demonstration-based and contrastive learning, we introduce two novel pre-training tasks: Demonstration-based Masked Language Modeling (MLM) and Class Contrastive Discrimination. These tasks effectively incorporate entity boundary information and enhance entity representation in Pre-trained Language Models (PLMs). In the downstream main task, we introduce a multi-task joint optimization framework with a semantic decomposing method, which enables the model to integrate two different kinds of semantic information for entity classification. Experimental results on two few-shot NER benchmarks demonstrate that MSDP consistently outperforms strong baselines by a large margin. Extensive analyses validate the effectiveness and generalization of MSDP.
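The abstract does not include code, but to make the Class Contrastive Discrimination pre-training objective concrete, below is a minimal sketch of a supervised contrastive loss over pooled entity span representations, where same-class entities are pulled together and different-class entities pushed apart. The function name, temperature value, and tensor shapes are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def class_contrastive_loss(entity_reps, labels, temperature=0.1):
    """Hedged sketch of a class-level contrastive objective.

    entity_reps: (N, d) pooled entity span embeddings from the PLM.
    labels:      (N,)   entity class ids.
    """
    reps = F.normalize(entity_reps, dim=-1)      # work in cosine-similarity space
    sim = reps @ reps.t() / temperature          # (N, N) pairwise similarities

    # Exclude self-similarity on the diagonal from the softmax denominator.
    n = reps.size(0)
    diag = torch.eye(n, dtype=torch.bool, device=reps.device)
    sim = sim.masked_fill(diag, float("-inf"))

    # Positive pairs: same entity class, excluding the anchor itself.
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~diag

    # Log-probability of each pair under a softmax over the anchor's row.
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average log-probability over each anchor's positives (zero-fill negatives
    # to avoid -inf * 0 = NaN on the diagonal).
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss_per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts

    # Only anchors that actually have at least one positive contribute.
    return loss_per_anchor[pos_mask.any(dim=1)].mean()
```

In a pre-training loop, a loss like this would typically be summed with the demonstration-based MLM loss, so that the PLM jointly learns entity boundaries and class-discriminative entity representations.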
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Few-shot NER | Few-NERD (INTER) | MSDP | 5 way 1~2 shot | 76.86±0.22 | # 1 |
| Few-shot NER | Few-NERD (INTER) | MSDP | 5 way 5~10 shot | 84.78±0.69 | # 1 |
| Few-shot NER | Few-NERD (INTER) | MSDP | 10 way 1~2 shot | 69.78±0.31 | # 1 |
| Few-shot NER | Few-NERD (INTER) | MSDP | 10 way 5~10 shot | 81.50±0.71 | # 1 |
| Few-shot NER | Few-NERD (INTRA) | MSDP | 5 way 1~2 shot | 56.35±0.28 | # 4 |
| Few-shot NER | Few-NERD (INTRA) | MSDP | 5 way 5~10 shot | 66.80±0.78 | # 4 |
| Few-shot NER | Few-NERD (INTRA) | MSDP | 10 way 1~2 shot | 47.13±0.69 | # 3 |
| Few-shot NER | Few-NERD (INTRA) | MSDP | 10 way 5~10 shot | 64.69±0.51 | # 1 |