Global Context Networks

24 Dec 2020  ·  Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu ·

The Non-Local Network (NLNet) presents a pioneering approach for capturing long-range dependencies within an image, by aggregating query-specific global context to each query position. However, through a rigorous empirical analysis, we have found that the global contexts modeled by the non-local network are almost the same for different query positions. In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation. We further replace the one-layer transformation function of the non-local block with a two-layer bottleneck, which reduces the number of parameters considerably. The resulting network element, called the global context (GC) block, effectively models global context in a lightweight manner, allowing it to be applied at multiple layers of a backbone network to form a global context network (GCNet). Experiments show that GCNet generally outperforms NLNet on major benchmarks for various recognition tasks. The code and network configurations are publicly available.
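The three steps described above (query-independent context modeling, a two-layer bottleneck transform, and broadcast fusion) can be sketched for a single image in plain NumPy. This is a minimal illustration, not the authors' implementation: the weight arrays (`w_k`, `w_v1`, `w_v2`) stand in for 1x1 convolutions, the LayerNorm omits its learnable affine parameters, and batch handling is left out.

```python
import numpy as np

def gc_block(x, w_k, w_v1, w_v2, eps=1e-5):
    """Sketch of a global context (GC) block for one image.

    x    : (C, H, W)  input feature map
    w_k  : (C,)       1x1 conv producing the shared attention logits
    w_v1 : (C_r, C)   bottleneck down-projection (C_r = C // r)
    w_v2 : (C, C_r)   bottleneck up-projection
    """
    C, H, W = x.shape
    x_flat = x.reshape(C, H * W)

    # 1. Context modeling: a single query-independent attention map,
    #    shared by all positions -- the key simplification over NLNet.
    logits = w_k @ x_flat                        # (H*W,)
    attn = np.exp(logits - logits.max())
    attn /= attn.sum()                           # softmax over positions
    context = x_flat @ attn                      # (C,) global context vector

    # 2. Transform: two-layer bottleneck (1x1 conv -> LayerNorm -> ReLU
    #    -> 1x1 conv), replacing NLNet's one-layer transform to cut
    #    parameters by roughly the bottleneck ratio r.
    h = w_v1 @ context                           # (C_r,)
    h = (h - h.mean()) / np.sqrt(h.var() + eps)  # LayerNorm, no affine
    h = np.maximum(h, 0.0)                       # ReLU
    delta = w_v2 @ h                             # (C,)

    # 3. Fusion: broadcast-add the same context vector to every position.
    return x + delta[:, None, None]
```

Because the attention map does not depend on the query position, the block computes one context vector per image instead of one per position, which is where the savings over the non-local block come from.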


Results from the Paper

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Object Detection | COCO minival | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | box AP | 51.8 | #8 |
| | | | AP50 | 70.4 | #6 |
| | | | AP75 | 56.1 | #4 |
| Instance Segmentation | COCO minival | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | mask AP | 44.7 | #8 |
| | | | AP50 | 67.9 | #1 |
| | | | AP75 | 48.4 | #3 |
| Object Detection | COCO test-dev | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | box AP | 52.3 | #24 |
| | | | AP50 | 70.9 | #21 |
| | | | AP75 | 56.9 | #22 |
| Instance Segmentation | COCO test-dev | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | mask AP | 45.4 | #8 |
| | | | AP50 | 68.9 | #4 |
| | | | AP75 | 49.6 | #4 |