Grounded Multimodal Named Entity Recognition
3 papers with code • 1 benchmark • 0 datasets
Most implemented papers
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition
Grounded Multimodal Named Entity Recognition (GMNER) is a nascent multimodal task that aims to identify named entities, entity types and their corresponding visual regions.
Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation
The Grounded Multimodal Named Entity Recognition (GMNER) task aims to identify named entities, entity types and their corresponding visual regions.
Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition
To tackle these, we propose a novel unified framework named Multi-grained Query-guided Set Prediction Network (MQSPN) to learn appropriate relationships at intra-entity and inter-entity levels.
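The task definition shared by the abstracts above (named entity, entity type, corresponding visual region) can be pictured as a small record type. The following is only an illustrative sketch, not code from any of the listed papers: the names GroundedEntity, iou and triple_matches are hypothetical, the box layout (x1, y1, x2, y2) is an assumption, and so are the use of None for an entity with no visible region and the 0.5 IoU threshold for counting a region as matched.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels, top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class GroundedEntity:
    """One hypothetical GMNER prediction: entity text, entity type, visual region."""
    text: str
    entity_type: str
    box: Optional[Box]  # Assumption: None means the entity has no region in the image.


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def triple_matches(pred: GroundedEntity, gold: GroundedEntity,
                   iou_threshold: float = 0.5) -> bool:
    """Assumed matching rule: text and type must be equal, and the regions must
    either both be absent or overlap with IoU at or above the threshold."""
    if pred.text != gold.text or pred.entity_type != gold.entity_type:
        return False
    if pred.box is None or gold.box is None:
        return pred.box is None and gold.box is None
    return iou(pred.box, gold.box) >= iou_threshold


gold = GroundedEntity("Golden Gate Bridge", "LOC", (10.0, 20.0, 110.0, 120.0))
pred = GroundedEntity("Golden Gate Bridge", "LOC", (15.0, 25.0, 110.0, 120.0))
print(triple_matches(pred, gold))
```

Treating the three fields as one unit reflects the task wording: a prediction is only useful when the entity, its type and its region agree together, so the comparison is done per triple rather than per field.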