CONTaiNER: Few-Shot Named Entity Recognition via Contrastive Learning

Named Entity Recognition (NER) in the Few-Shot setting is imperative for entity tagging in low-resource domains. Existing approaches only learn class-specific semantic features and intermediate representations from source domains. This affects generalizability to unseen target domains, resulting in suboptimal performance. To this end, we present CONTaiNER, a novel contrastive learning technique that optimizes the inter-token distribution distance for Few-Shot NER. Instead of optimizing class-specific attributes, CONTaiNER optimizes a generalized objective of differentiating between token categories based on their Gaussian-distributed embeddings. This effectively alleviates overfitting issues originating from training domains. Our experiments on several traditional test domains (OntoNotes, CoNLL'03, WNUT'17, GUM) and a new large-scale Few-Shot NER dataset (Few-NERD) demonstrate that, on average, CONTaiNER outperforms previous methods by 3%-13% absolute F1 points while showing consistent performance trends, even in challenging scenarios where previous approaches could not achieve appreciable performance.
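As an illustration of the idea in the abstract, the following is a minimal sketch of a contrastive objective over Gaussian token embeddings. It is not taken from the paper. The abstract does not say which distribution distance is used, so the symmetrised KL divergence between diagonal Gaussians is an assumption here, and the function names (kl_diag_gauss, sym_distance, contrastive_loss) are hypothetical. Each token is represented as a (mean, variance) pair, and the loss pulls tokens with the same label together while pushing the others apart.

```python
import math

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for diagonal Gaussians given as lists of means and variances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def sym_distance(p, q):
    """Symmetrised KL between two (mean, variance) pairs (assumed distance, see lead-in)."""
    return 0.5 * (kl_diag_gauss(*p, *q) + kl_diag_gauss(*q, *p))

def contrastive_loss(tokens, labels):
    """Mean over anchors of -log softmax(-distance), averaged over same-label tokens."""
    n = len(tokens)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-label partner contributes nothing
        # Similarity score is the negative distance; normalise in log space for stability.
        log_scores = {j: -sym_distance(tokens[i], tokens[j]) for j in range(n) if j != i}
        top = max(log_scores.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in log_scores.values()))
        loss_i = -sum(log_scores[j] - log_denom for j in positives) / len(positives)
        losses.append(loss_i)
    return sum(losses) / len(losses) if losses else 0.0

# Toy example: two tight clusters of token distributions with unit variance.
tokens = [
    ([0.0, 0.0], [1.0, 1.0]),
    ([0.1, 0.0], [1.0, 1.0]),
    ([3.0, 3.0], [1.0, 1.0]),
    ([3.1, 3.0], [1.0, 1.0]),
]
aligned = contrastive_loss(tokens, ["PER", "PER", "O", "O"])
misaligned = contrastive_loss(tokens, ["PER", "O", "PER", "O"])
```

In this toy setup the loss is lower when the labels agree with the clusters (aligned) than when they cut across them (misaligned), which is the behaviour a contrastive objective of this kind is meant to reward.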

ACL 2022

Results from the Paper


Ranked #2 on Few-shot NER on Few-NERD (INTRA) (using extra training data)

Task          Dataset           Model      Metric Name        Metric Value  Global Rank
Few-shot NER  Few-NERD (INTER)  CONTaiNER  5 way 1~2 shot     55.95         #3
Few-shot NER  Few-NERD (INTER)  CONTaiNER  5 way 5~10 shot    61.83         #3
Few-shot NER  Few-NERD (INTER)  CONTaiNER  10 way 1~2 shot    48.35         #3
Few-shot NER  Few-NERD (INTER)  CONTaiNER  10 way 5~10 shot   57.12         #3
Few-shot NER  Few-NERD (INTRA)  CONTaiNER  5 way 1~2 shot     40.43         #2
Few-shot NER  Few-NERD (INTRA)  CONTaiNER  5 way 5~10 shot    53.70         #2
Few-shot NER  Few-NERD (INTRA)  CONTaiNER  10 way 1~2 shot    33.84         #2
Few-shot NER  Few-NERD (INTRA)  CONTaiNER  10 way 5~10 shot   47.49         #2

Methods