CGI-Stereo: Accurate and Real-Time Stereo Matching via Context and Geometry Interaction

In this paper, we propose CGI-Stereo, a novel neural network architecture that simultaneously achieves real-time performance, competitive accuracy, and strong generalization ability. The core of CGI-Stereo is a Context and Geometry Fusion (CGF) block, which adaptively fuses context and geometry information for more effective cost aggregation while also providing feedback to feature learning to guide more effective contextual feature extraction. The proposed CGF can be easily embedded into many existing stereo matching networks, such as PSMNet, GwcNet and ACVNet, and the resulting networks show a significant improvement in accuracy. In particular, the model that combines our CGF with ACVNet ranks $1^{st}$ on the KITTI 2012 and 2015 leaderboards among all published methods. We further propose an informative and concise cost volume, named Attention Feature Volume (AFV), which uses a correlation volume as attention weights to filter a feature volume. Based on CGF and AFV, the proposed CGI-Stereo outperforms all other published real-time methods on the KITTI benchmarks and shows better generalization ability than other real-time methods. Code is available at https://github.com/gangweiX/CGI-Stereo.
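
To make the AFV idea concrete, the following is a minimal sketch of "a correlation volume used as attention weights to filter a feature volume", written for a single scanline in plain Python. It is an illustration under stated assumptions, not the released implementation: the function names `correlation_volume` and `attention_feature_volume`, the use of cosine similarity as the correlation, the choice of left-image features as the feature volume, and the plain element-wise product (with no learned layers or gating nonlinearity) are all assumptions made here for clarity.

```python
import math


def correlation_volume(left, right, max_disp):
    """Build corr[d][x] for one scanline (illustrative, assumed formulation).

    left, right: feature maps given as [channel][x] nested lists.
    corr[d][x] is the cosine similarity between the left feature at x
    and the right feature at x - d.
    """
    channels, width = len(left), len(left[0])

    def norm(feat, x):
        # Fall back to 1.0 for an all-zero feature vector to avoid dividing by zero.
        return math.sqrt(sum(feat[c][x] ** 2 for c in range(channels))) or 1.0

    corr = [[0.0] * width for _ in range(max_disp)]
    for d in range(max_disp):
        # Positions with x - d < 0 have no right-image counterpart and stay 0.
        for x in range(d, width):
            dot = sum(left[c][x] * right[c][x - d] for c in range(channels))
            corr[d][x] = dot / (norm(left, x) * norm(right, x - d))
    return corr


def attention_feature_volume(left, corr):
    """Filter left features with the correlation volume (assumed: plain product).

    Returns afv[c][d][x] = left[c][x] * corr[d][x].
    """
    return [
        [[left_c[x] * corr_d[x] for x in range(len(left_c))] for corr_d in corr]
        for left_c in left
    ]


if __name__ == "__main__":
    # Toy scanline: 2 channels, width 3, true disparity 1 for x >= 1.
    left = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
    right = [[0.0, 2.0, 1.0], [1.0, 0.0, 1.0]]
    corr = correlation_volume(left, right, max_disp=2)
    afv = attention_feature_volume(left, corr)
    print(corr)
    print(afv)
```

In this toy example the left features survive unchanged at the disparity where the two views match and are suppressed elsewhere, which is the filtering behaviour the abstract describes. The resulting volume has one slice per feature channel and disparity, so the context carried by the features is kept while the correlation supplies the geometric evidence.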
