PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video

CVPR 2024  ·  Dong Wu, Zike Yan, Hongbin Zha ·

We introduce the Panoptic 3D Reconstruction task a unified and holistic scene understanding task for a monocular video. And we present PanoRecon - a novel framework to address this new task which realizes an online geometry reconstruction alone with dense semantic and instance labeling. Specifically PanoRecon incrementally performs panoptic 3D reconstruction for each video fragment consisting of multiple consecutive key frames from a volumetric feature representation using feed-forward neural networks. We adopt a depth-guided back-projection strategy to sparse and purify the volumetric feature representation. We further introduce a voxel clustering module to get object instances in each local fragment and then design a tracking and fusion algorithm for the integration of instances from different fragments to ensure temporal coherence. Such design enables our PanoRecon to yield a coherent and accurate panoptic 3D reconstruction. Experiments on ScanNetV2 demonstrate a very competitive geometry reconstruction result compared with state-of-the-art reconstruction methods as well as promising 3D panoptic segmentation result with only RGB input while being real-time. Code is available at: https://github.com/Riser6/PanoRecon.

PDF Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here