Multi-modal Named Entity Recognition
6 papers with code • 5 benchmarks • 0 datasets
Multi-modal named entity recognition (MNER) aims to improve the accuracy of NER models by leveraging the images that accompany text.
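To make the task concrete, here is a minimal sketch of what a single MNER example might look like, assuming a Twitter-style dataset that pairs each post with one image; the field names are illustrative, not from any specific dataset release.

```python
# One MNER example: a sentence, its BIO entity tags, and the attached image.
example = {
    "tokens": ["Kevin", "Durant", "enters", "Oracle", "Arena", "!"],
    # BIO tags over the tokens: PER for the player, LOC for the venue.
    "ner_tags": ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"],
    # The image posted with the text; visual context can help disambiguate
    # entity types (e.g., a person's name vs. a brand name).
    "image": "images/12345.jpg",
}
```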
Most implemented papers
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition
Grounded Multimodal Named Entity Recognition (GMNER) is a nascent multimodal task that aims to identify named entities, entity types and their corresponding visual regions.
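Below is a minimal sketch of what a GMNER prediction could look like: each entity carries its type and, when the entity is visually groundable, a bounding box in the paired image. The structure and coordinate format are assumptions for illustration, not the paper's exact output schema.

```python
# A GMNER output is a set of (entity, type, visual region) triples; the
# region is None when the entity cannot be grounded in the image.
prediction = [
    {"entity": "Kevin Durant", "type": "PER", "region": (120, 35, 310, 420)},
    {"entity": "Oracle Arena", "type": "LOC", "region": None},  # not groundable
]
```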
Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer
We propose a multimodal interaction module to obtain both image-aware word representations and word-aware visual representations.
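A common way to realize such an interaction module is bidirectional cross-attention. The PyTorch sketch below renders that idea under assumed dimensions and layer choices; it is not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class CrossModalInteraction(nn.Module):
    """A sketch of a multimodal interaction module: one cross-attention
    yields image-aware word representations, the other word-aware visual
    representations. Sizes and layers here are illustrative assumptions."""

    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.txt2img = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.img2txt = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, words: torch.Tensor, regions: torch.Tensor):
        # words:   (batch, num_tokens, dim)  e.g. BERT token states
        # regions: (batch, num_regions, dim) e.g. projected CNN region features
        image_aware_words, _ = self.txt2img(query=words, key=regions, value=regions)
        word_aware_visual, _ = self.img2txt(query=regions, key=words, value=words)
        return image_aware_words, word_aware_visual
```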
RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER
We integrate soft or hard gates to select visual clues and propose a multitask algorithm to train on the MNER datasets.
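The gating idea can be sketched as follows: a text-image relation score predicted from the pooled multimodal state scales the visual features (soft gate) or zeroes them out entirely (hard gate). This is an assumption about the mechanism, not RpBERT's exact implementation.

```python
import torch
import torch.nn as nn

class VisualGate(nn.Module):
    """A sketch of relation-based gating: a predicted relation score
    modulates the visual clues before fusion. Details are assumed."""

    def __init__(self, dim: int = 768, hard: bool = False):
        super().__init__()
        self.relation = nn.Linear(dim, 1)  # relation classifier head (assumed)
        self.hard = hard

    def forward(self, cls_state: torch.Tensor, visual: torch.Tensor):
        # cls_state: (batch, dim)              pooled text-image state
        # visual:    (batch, num_regions, dim) visual features
        score = torch.sigmoid(self.relation(cls_state))  # (batch, 1)
        if self.hard:
            # Hard 0/1 gate; non-differentiable, so in practice the relation
            # classifier would be trained with its own (multitask) objective.
            score = (score > 0.5).float()
        return visual * score.unsqueeze(1)  # gated visual clues
```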
ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition
As text representations play the most important role in MNER, in this paper we propose Image-text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.
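One way to read the ITA idea: render the image as text (object tags, a caption) and concatenate it with the sentence, so the self-attention of a standard pretrained text encoder handles the cross-modal alignment. In the sketch below, detect_object_tags and generate_caption are hypothetical placeholders for off-the-shelf vision models, and the separator convention is assumed.

```python
# A sketch of ITA-style input construction: image-derived text is appended
# after separators; the NER head still labels only the sentence tokens.
def build_ita_input(sentence: str, image_path: str) -> str:
    tags = detect_object_tags(image_path)    # hypothetical, e.g. ["man", "ball"]
    caption = generate_caption(image_path)   # hypothetical, e.g. "a man holding a ball"
    return sentence + " [SEP] " + " ".join(tags) + " [SEP] " + caption
```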
Named Entity and Relation Extraction with Multi-Modal Retrieval
MoRe contains a text retrieval module and an image-based retrieval module, which retrieve knowledge related to the input text and image from a knowledge corpus, respectively.
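A retrieval-augmented pipeline in this spirit can be sketched as follows; text_index and image_index are hypothetical retriever objects standing in for MoRe's text and image retrieval modules, and the concatenation format is an assumption.

```python
# A sketch of multi-modal retrieval augmentation: both retrievers fetch
# related passages, which are appended to the input before tagging.
def retrieve_context(sentence: str, image_path: str, k: int = 3) -> str:
    text_hits = text_index.search(sentence, top_k=k)      # hypothetical text retriever
    image_hits = image_index.search(image_path, top_k=k)  # hypothetical image-to-text retriever
    passages = [hit.text for hit in text_hits + image_hits]
    return sentence + " [SEP] " + " ".join(passages)
```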
Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge
However, existing methods either neglect to provide the model with external knowledge or suffer from high redundancy in the retrieved knowledge.
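The paper's recipe of querying an LLM for auxiliary knowledge can be sketched with the OpenAI Python client; the prompt wording and model name below are assumptions, and the returned explanation would be fed to the MNER model as extra context.

```python
from openai import OpenAI

client = OpenAI()

def auxiliary_knowledge(sentence: str, caption: str) -> str:
    # Ask the LLM to explain likely entities in the post; prompt is assumed.
    prompt = (
        "Given the tweet and its image caption, briefly explain the likely "
        f"named entities.\nTweet: {sentence}\nCaption: {caption}"
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```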