Multi-modal Named Entity Recognition

4 papers with code • 5 benchmarks • 0 datasets

Multi-modal named entity recognition (MNER) aims to improve the accuracy of NER models by utilizing image information.
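
To illustrate the general idea, here is a minimal sketch of one common fusion pattern: each text token attends over image-region features before per-token tag classification. The module names, dimensions, and tag set are illustrative assumptions, not any particular paper's architecture.

```python
import torch
import torch.nn as nn

class ToyMultimodalNER(nn.Module):
    """Toy sketch: fuse per-token text features with image-region features
    via cross-attention, then classify each token with a BIO-style tag."""
    def __init__(self, text_dim=768, image_dim=512, hidden=256, num_tags=9):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden)
        self.image_proj = nn.Linear(image_dim, hidden)
        self.cross_attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(hidden, num_tags)

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, seq_len, text_dim), e.g. token embeddings from a text encoder
        # image_feats: (batch, num_regions, image_dim), e.g. visual region features
        q = self.text_proj(text_feats)
        kv = self.image_proj(image_feats)
        attended, _ = self.cross_attn(q, kv, kv)  # each token gathers visual clues
        fused = q + attended                      # residual fusion of text and vision
        return self.classifier(fused)             # per-token tag logits

# toy usage with random features standing in for encoder outputs
model = ToyMultimodalNER()
logits = model(torch.randn(2, 16, 768), torch.randn(2, 49, 512))
print(logits.shape)  # torch.Size([2, 16, 9])
```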

Most implemented papers

RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER

Multimodal-NER/RpBERT 5 Feb 2021

We integrate soft or hard gates to select visual clues and propose a multitask algorithm to train on the MNER datasets.
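
A hedged sketch of the gating idea described above: a scalar gate, predicted from the joint text/image representation, controls how much visual information reaches the tagger, either softly (continuous weight) or hard (keep or drop). Names and dimensions are assumptions, not the RpBERT implementation.

```python
import torch
import torch.nn as nn

class VisualGate(nn.Module):
    """Scalar gate over visual features, conditioned on text and image."""
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, 1), nn.Sigmoid())

    def forward(self, text_vec, image_vec, hard=False):
        g = self.gate(torch.cat([text_vec, image_vec], dim=-1))  # (batch, 1) in [0, 1]
        if hard:
            g = (g > 0.5).float()        # hard gate: keep or drop the visual clue entirely
        return text_vec + g * image_vec  # gated residual fusion

fuse = VisualGate()
out = fuse(torch.randn(4, 256), torch.randn(4, 256))
print(out.shape)  # torch.Size([4, 256])
```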

ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition

alibaba-nlp/kb-ner NAACL 2022

As text representations play the most important role in MNER, in this paper, we propose Image-Text Alignments (ITA) to align image features into the textual space, so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.
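
The core idea is to verbalize the image (for example as captions, OCR text, or detected object tags) and place those words in the same input as the sentence, so a text-only transformer's self-attention can relate them to the entity mentions. The sketch below illustrates this input construction; the specific verbalizers and input layout are assumptions, not the paper's exact pipeline.

```python
def build_ita_style_input(sentence, captions, ocr_text, object_tags, sep="[SEP]"):
    """Append verbalized image content after the sentence as cross-modal context.

    The tagger predicts labels only for the original sentence tokens; the
    appended image words just provide context the attention can use.
    """
    image_words = " ".join(captions + object_tags + ([ocr_text] if ocr_text else []))
    return f"{sentence} {sep} {image_words}"

print(build_ita_style_input(
    "Kevin Durant enters the arena",
    captions=["a basketball player walking through a tunnel"],
    ocr_text="",
    object_tags=["person", "jersey"],
))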

Named Entity and Relation Extraction with Multi-Modal Retrieval

modelscope/adaseq 3 Dec 2022

MoRe contains a text retrieval module and an image-based retrieval module, which retrieve knowledge related to the input text and the input image from the knowledge corpus, respectively.
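
As a rough illustration of the retrieval step, the sketch below scores a knowledge corpus against a query embedding (derived from either the input text or the image) and returns the top-k passages to use as extra context. The embedding source, corpus, and scoring are assumptions, not MoRe's actual retrieval modules.

```python
import numpy as np

def retrieve_context(query_vec, corpus_vecs, corpus_texts, k=2):
    """Return the k corpus passages most similar to the query embedding."""
    # cosine similarity between the query and each corpus entry
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    top = np.argsort(-scores)[:k]
    return [corpus_texts[i] for i in top]

corpus_texts = ["Durant is an NBA forward.", "The arena hosts concerts.", "Paris is in France."]
corpus_vecs = np.random.rand(3, 64)  # stand-ins for passage embeddings
print(retrieve_context(np.random.rand(64), corpus_vecs, corpus_texts))
```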

Prompting ChatGPT in MNER: Enhanced Multimodal Named Entity Recognition with Auxiliary Refined Knowledge

jinyuanli0012/pgim 20 May 2023

However, these methods either neglect the need to provide the model with external knowledge or suffer from high redundancy in the retrieved knowledge.
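
In the spirit of this approach, the sketch below builds a few-shot prompt asking a large language model for auxiliary knowledge about a sentence-image pair before tagging. The prompt wording and few-shot format are assumptions, not the paper's exact template.

```python
def build_auxiliary_knowledge_prompt(sentence, image_caption, examples):
    """Assemble a few-shot prompt requesting background knowledge for MNER."""
    demo = "\n\n".join(
        f"Sentence: {s}\nImage: {c}\nKnowledge: {k}" for s, c, k in examples
    )
    return (
        "Provide background knowledge that helps identify the named entities.\n\n"
        f"{demo}\n\n"
        f"Sentence: {sentence}\nImage: {image_caption}\nKnowledge:"
    )

prompt = build_auxiliary_knowledge_prompt(
    "KD drops 40 points tonight",
    "a basketball player shooting the ball",
    [("Messi scores twice", "a soccer player celebrating",
      "Messi refers to Lionel Messi, an Argentine footballer.")],
)
print(prompt)
```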