Multi-modal Named Entity Recognition
3 papers with code • 4 benchmarks • 0 datasets
Multi-modal named entity recognition (MNER) aims to improve the accuracy of NER models by using image information alongside the text.
Most implemented papers
RpBERT: A Text-image Relation Propagation-based BERT Model for Multimodal NER
We integrate soft or hard gates to select visual clues and propose a multitask algorithm to train on the MNER datasets.
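As a rough illustration of the difference between a soft and a hard gate on visual features, the sketch below scales a toy visual feature vector by a relevance score. It is not the RpBERT implementation: the function names, the sigmoid-based score, the `relevance_logit` input, and the 0.5 threshold are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_gate(visual, relevance_logit):
    """Scale every visual feature by a continuous relevance score (hypothetical helper)."""
    g = sigmoid(relevance_logit)
    return [g * v for v in visual]


def hard_gate(visual, relevance_logit, threshold=0.5):
    """Keep the visual features unchanged or zero them out (hypothetical helper)."""
    g = 1.0 if sigmoid(relevance_logit) >= threshold else 0.0
    return [g * v for v in visual]


# Toy visual feature vector; real models would use image-encoder outputs.
visual = [0.2, -1.0, 0.5]

print(soft_gate(visual, 0.0))   # score is 0.5, so each feature is halved
print(hard_gate(visual, 2.0))   # score is above the threshold, features kept
print(hard_gate(visual, -2.0))  # score is below the threshold, features zeroed
```

A soft gate lets partially relevant images contribute in proportion to their score, while a hard gate makes an all-or-nothing decision per image.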
ITA: Image-Text Alignments for Multi-Modal Named Entity Recognition
Because text representations play the most important role in MNER, we propose Image-text Alignments (ITA), which align image features into the textual space so that the attention mechanism in transformer-based pretrained textual embeddings can be better utilized.
Named Entity and Relation Extraction with Multi-Modal Retrieval
MoRe contains a text retrieval module and an image-based retrieval module, which retrieve knowledge related to the input text and the input image, respectively, from the knowledge corpus.
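The two-module idea can be sketched as a pair of nearest-neighbour lookups over one shared corpus, one driven by a text query and one by an image query. This is only a toy under stated assumptions, not MoRe's retrieval pipeline: the `retrieve` and `cosine` helpers, the two-dimensional hand-written vectors, and the placeholder corpus entries are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec, corpus, k=1):
    """Return the text of the k corpus entries most similar to the query vector."""
    ranked = sorted(corpus, key=lambda item: cosine(query_vec, item["vec"]), reverse=True)
    return [item["text"] for item in ranked[:k]]


# Placeholder knowledge corpus with hand-written embeddings (assumption).
corpus = [
    {"text": "knowledge entry A", "vec": [1.0, 0.0]},
    {"text": "knowledge entry B", "vec": [0.0, 1.0]},
]

# Hypothetical embeddings of the input text and the input image.
text_query_vec = [0.9, 0.1]
image_query_vec = [0.2, 0.8]

text_hits = retrieve(text_query_vec, corpus)    # text retrieval module
image_hits = retrieve(image_query_vec, corpus)  # image-based retrieval module

print(text_hits)
print(image_hits)
```

In this toy the two queries surface different entries, which shows why retrieving from both modalities can supply complementary context for the same input.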