Inspired by early work on indoor modeling, we leverage the structural regularities exhibited in indoor scenes to train a better depth network.
This paper proposes a novel simultaneous localization and mapping (SLAM) approach, namely Attention-SLAM, which mimics human navigation by combining a visual saliency model (SalNavNet) with traditional monocular visual SLAM.
The experimental results show that the proposed method converges in only a few iterations and achieves an accuracy of 91.15% on a real IMU dataset, demonstrating its efficiency and effectiveness.
Instead of using the Manhattan world assumption, we use the Atlanta world model to describe such regularity.
This work addresses the outlier removal problem in large-scale global structure-from-motion.
We present a method to jointly estimate scene depth and recover the clear latent image from a foggy video sequence.