Curvature-guided dynamic scale networks for Multi-view Stereo

ICLR 2022 · Khang Truong Giang, Soohwan Song, Sungho Jo

Multi-view stereo (MVS) is a crucial task for precise 3D reconstruction. Most recent studies have tried to improve the matching cost volume in MVS by designing aggregated 3D cost volumes and their regularization. This paper instead focuses on learning a robust feature extraction network that improves the matching costs without heavy computation in the other steps. In particular, we present a dynamic scale feature extraction network, namely CDSFNet. It is composed of multiple novel convolution layers, each of which can select a proper patch scale for each pixel, guided by the normal curvature of the image surface. As a result, CDSFNet can estimate the optimal patch scales and learn discriminative features for accurate matching between reference and source images. By combining the robust extracted features with an appropriate cost formulation strategy, the resulting MVS architecture can estimate depth maps more precisely. Extensive experiments show that the proposed method outperforms other state-of-the-art methods on complex outdoor scenes and significantly improves the completeness of reconstructed models. It also processes higher-resolution inputs with faster run-time and lower memory usage than other MVS methods. Our source code is available at https://github.com/TruongKhang/cds-mvsnet.
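
The idea of scoring candidate patch scales by normal curvature can be illustrated with a minimal NumPy sketch. This is not the authors' CDSFNet implementation (the linked repository is the reference for that). It uses the standard differential-geometry formula for the normal curvature of the surface z = I(x, y) along a unit direction (u, v), approximates the derivatives by Gaussian smoothing followed by finite differences, and then picks, per pixel, the candidate scale with the smallest absolute curvature. The helper names, the candidate sigmas, and the argmin selection rule are all assumptions made for illustration.

```python
import numpy as np


def gaussian_blur(image, sigma):
    """Separable Gaussian blur using only NumPy (border values are replicated)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    # Filter along rows, then along columns; "valid" strips the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def normal_curvature(image, sigma, direction):
    """Normal curvature of the surface z = I(x, y) along `direction` at scale `sigma`.

    Formula (with (u, v) normalised to unit length):
        k = (u^2 I_xx + 2uv I_xy + v^2 I_yy)
            / (sqrt(1 + I_x^2 + I_y^2) * (1 + (u I_x + v I_y)^2))
    """
    direction = np.asarray(direction, dtype=float)
    u, v = direction / np.linalg.norm(direction)
    smooth = gaussian_blur(np.asarray(image, dtype=float), sigma)
    iy, ix = np.gradient(smooth)      # axis 0 is y, axis 1 is x
    iyy, _ = np.gradient(iy)
    ixy, ixx = np.gradient(ix)
    second_form = u * u * ixx + 2.0 * u * v * ixy + v * v * iyy
    first_form = 1.0 + (u * ix + v * iy) ** 2
    return second_form / (np.sqrt(1.0 + ix ** 2 + iy ** 2) * first_form)


def select_scale(image, sigmas, direction):
    """Illustrative rule (an assumption): pick the scale with the smallest |curvature|."""
    curvatures = np.stack(
        [np.abs(normal_curvature(image, s, direction)) for s in sigmas]
    )
    return np.argmin(curvatures, axis=0), curvatures


# Usage on a synthetic image; sigmas and direction are arbitrary example values.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
scale_index, curvatures = select_scale(
    image, sigmas=(1.0, 2.0, 4.0), direction=(1.0, 0.0)
)
```

Here `scale_index` holds, for every pixel, the index of the chosen candidate scale. A learned layer would replace the hard argmin with something differentiable; the sketch only shows how curvature can vary with scale and direction.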

Results from the Paper


Ranked #9 on Point Clouds on Tanks and Temples (Mean F1 (Intermediate) metric)

Task               Dataset            Model       Metric Name             Metric Value  Global Rank
3D Reconstruction  DTU                CDS-MVSNet  Acc                     0.351         # 14
3D Reconstruction  DTU                CDS-MVSNet  Overall                 0.315         # 9
3D Reconstruction  DTU                CDS-MVSNet  Comp                    0.278         # 8
Point Clouds       Tanks and Temples  CDS-MVSNet  Mean F1 (Intermediate)  61.58         # 9
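
On DTU, the Overall score is conventionally the mean of accuracy (Acc) and completeness (Comp), both distances in millimetres where lower is better. The short check below, using only the values listed above, shows that the reported numbers are consistent with that convention; the variable names are illustrative.

```python
# DTU metrics for CDS-MVSNet as listed in the results table (mm, lower is better).
acc_mm = 0.351
comp_mm = 0.278

# Overall is taken here as the arithmetic mean of accuracy and completeness.
overall_mm = (acc_mm + comp_mm) / 2
print(f"Overall = {overall_mm:.4f} mm")
```

The mean comes to roughly 0.3145 mm, which matches the listed Overall of 0.315 once rounded to three decimals.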

Methods