Detecting and localizing objects in real 3D space, which plays a crucial role in scene understanding, is particularly challenging given only a monocular image, because geometric information is lost during image projection.
A crucial task in scene understanding is 3D object detection, which aims to detect and localize the 3D bounding boxes of objects belonging to specific classes.
Experimental results show that our uncertainty modeling is effective at alleviating the interference of background frames and brings a large performance gain without bells and whistles.
In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection.
In this paper, we study the problem of 3D object detection from stereo images, in which the key challenge is how to effectively utilize stereo information.
We propose MonoGRNet for amodal 3D object detection from a monocular RGB image via geometric reasoning in both the observed 2D projection and the unobserved depth dimension.
Ranked #19 on Monocular 3D Object Detection on KITTI Cars Moderate
In this paper, we address the problem of reconstructing an object's surface from a single image using generative networks.
In this paper, we propose a scale-invariant image matching approach to tackle very large scale variation between views.
In this paper, we tackle the accurate and consistent Structure from Motion (SfM) problem, in particular camera registration, in parallel and at a scale far exceeding the memory of a single computer.
In this paper, we propose a structural segmentation algorithm to partition multi-view stereo reconstructed surfaces of large-scale urban environments into structural segments.