Causality Compensated Attention for Contextual Biased Visual Recognition

ICLR 2023 · Ruyang Liu, Jingjia Huang, Ge Li, Thomas H. Li

Visual attention does not always capture the essential object representations needed for robust predictions. Attention modules tend to highlight not only the target object but also the common co-occurring context that the module deems helpful during training. This problem is rooted in the confounding effect of context, which leads to spurious causal links between objects and predictions and is further exacerbated by visual attention. In this paper, to learn causal object features robust to contextual bias, we propose a novel attention module named Interventional Dual Attention (IDA) for visual recognition. Specifically, IDA adopts two attention layers with a multiple-sampling intervention, which compensates the attention against the confounding context. Note that our method is model-agnostic and can thus be implemented on various backbones. Extensive experiments show that our model obtains significant improvements in classification and detection with lower computational cost. In particular, we achieve state-of-the-art results in multi-label classification on MS-COCO and PASCAL VOC.
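The abstract describes the mechanism only at a high level: two attention layers, with the second compensated by averaging over multiple sampled confounder contexts (a backdoor-adjustment-style intervention). As a rough illustration only, and not the paper's actual module, a minimal sketch might look like this; all function and variable names here are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Plain scaled dot-product attention over feature tokens."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def interventional_dual_attention(feats, confounders, n_samples=4, rng=None):
    """Illustrative sketch (NOT the paper's exact IDA module):
    a first attention pass over image features, then a second pass
    whose output is averaged over several randomly sampled confounder
    'context' prototypes, approximating a backdoor-style intervention."""
    rng = rng or np.random.default_rng(0)
    h = attention(feats, feats, feats)            # first attention layer
    outs = []
    for _ in range(n_samples):                    # multiple-sampling intervention
        idx = rng.choice(len(confounders), size=4, replace=False)
        ctx = confounders[idx]
        outs.append(attention(h, ctx, ctx))       # second layer vs. sampled context
    return np.mean(outs, axis=0)                  # averaging compensates context bias
```

The averaging step is what "compensates" the attention: no single sampled context dominates the output, which is the intuition behind intervening on the confounder rather than conditioning on it.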

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Multi-Label Image Classification | MSCOCO | IDA-R101(H) | mAP | 84.8 | #1 |
| Multi-Label Image Classification | MSCOCO | IDA-R101(H) 576 | mAP | 86.3 | #2 |
| Multi-Label Image Classification | MSCOCO | IDA-SwinL(H) 384 | mAP | 90.3 | #3 |
| Multi-Label Classification | MS-COCO | IDA-SwinL | mAP | 90.3 | #9 |
| Multi-Label Classification | MS-COCO | IDA-R101 | mAP | 86.3 | #19 |

