Referring Image Segmentation via Cross-Modal Progressive Comprehension

Referring image segmentation aims at segmenting the foreground masks of the entities that can well match the description given in the natural language expression. Previous approaches tackle this problem using implicit feature interaction and fusion between visual and linguistic modalities, but usually fail to explore informative words of the expression to well align features from the two modalities for accurately identifying the referred entity...

CVPR 2020

Datasets


| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Referring Expression Segmentation | RefCOCO val | CMPC | IoU | 61.36 | # 1 |
| Referring Expression Segmentation | RefCOCO testA | CMPC | IoU | 64.53 | # 1 |
| Referring Expression Segmentation | RefCOCO testB | CMPC | IoU | 59.64 | # 1 |
| Referring Expression Segmentation | RefCOCO+ val | CMPC | Overall IoU | 49.56 | # 1 |
| Referring Expression Segmentation | RefCOCO+ testA | CMPC | Overall IoU | 53.44 | # 1 |
| Referring Expression Segmentation | RefCOCO+ testB | CMPC | Overall IoU | 43.23 | # 1 |
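
The page does not define the reported metrics. As a rough illustration only, the sketch below assumes the convention commonly used in referring segmentation, where "Overall IoU" is the total intersection area divided by the total union area accumulated over all test samples. Whether the leaderboard's "IoU" and "Overall IoU" columns follow exactly this definition is an assumption, and the function names `mask_iou` and `overall_iou` are hypothetical helpers, not part of any released code.

```python
import numpy as np


def mask_iou(pred, target):
    """IoU of two binary masks with identical shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def overall_iou(preds, targets):
    """Cumulative intersection over cumulative union across all samples."""
    total_inter = 0
    total_union = 0
    for pred, target in zip(preds, targets):
        pred = np.asarray(pred, dtype=bool)
        target = np.asarray(target, dtype=bool)
        total_inter += int(np.logical_and(pred, target).sum())
        total_union += int(np.logical_or(pred, target).sum())
    if total_union == 0:
        # Same assumed convention as in mask_iou.
        return 1.0
    return total_inter / total_union


# Two toy 2x2 samples (made-up data, not from any dataset).
preds = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
targets = [[[1, 0], [0, 0]], [[1, 1], [1, 1]]]

per_sample = [mask_iou(p, t) for p, t in zip(preds, targets)]
cumulative = overall_iou(preds, targets)
```

Because the cumulative form weights every pixel equally, large objects influence it more than small ones, so it generally differs from the mean of the per-sample IoU values.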
