MonoScene: Monocular 3D Semantic Scene Completion

CVPR 2022 · Anh-Quan Cao, Raoul de Charette

MonoScene proposes a 3D Semantic Scene Completion (SSC) framework in which the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D-to-3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with architectural contributions, we introduce novel global scene and local frustum losses. Experiments show we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene.
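The "2D-3D feature projection inspired by optics" mentioned in the abstract can be illustrated with a minimal sketch: each 3D voxel center is projected onto the image plane via the camera intrinsics, and the 2D feature at that pixel is lifted into the voxel. This is a simplified, single-scale illustration of the general idea (MonoScene's actual FLoSP module operates on multi-scale feature maps and uses learned fusion); the function name and nearest-neighbor sampling here are illustrative choices, not the paper's implementation.

```python
import numpy as np

def lift_2d_features_to_3d(feat2d, voxel_centers, K):
    """Lift 2D image features into 3D voxels by perspective projection.

    feat2d: (H, W, C) 2D feature map from the 2D UNet.
    voxel_centers: (N, 3) voxel centers in camera coordinates.
    K: (3, 3) camera intrinsics matrix.
    Returns (N, C); voxels projecting outside the image (or behind the
    camera) receive zero features.
    """
    H, W, C = feat2d.shape
    # Perspective projection: [u*z, v*z, z]^T = K @ [x, y, z]^T
    proj = (K @ voxel_centers.T).T            # (N, 3)
    uv = proj[:, :2] / proj[:, 2:3]           # divide by depth
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    # Keep only voxels that land inside the image and are in front of the camera
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (voxel_centers[:, 2] > 0)
    out = np.zeros((len(voxel_centers), C), dtype=feat2d.dtype)
    out[inside] = feat2d[v[inside], u[inside]]  # nearest-neighbor sampling
    return out
```

Because many voxels along one camera ray map to the same pixel, the 3D UNet that follows is what disambiguates depth; the projection itself only distributes 2D evidence along rays.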

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| 3D Semantic Scene Completion from a single RGB image | KITTI-360 | MonoScene | mIoU | 12.31 | #2 |
| 3D Semantic Scene Completion | KITTI-360 | MonoScene | mIoU | 12.31 | #6 |
| 3D Semantic Scene Completion from a single RGB image | NYUv2 | MonoScene | mIoU | 26.94 | #2 |
| 3D Semantic Scene Completion | NYUv2 | MonoScene (RGB input only) | mIoU | 26.94 | #24 |
| 3D Semantic Scene Completion from a single RGB image | SemanticKITTI | MonoScene | mIoU | 11.08 | #5 |
| 3D Semantic Scene Completion | SemanticKITTI | MonoScene (RGB input only) | mIoU | 11.08 | #15 |

Methods


No methods listed for this paper.