Based on the object depth, the dense coordinate patch, together with the corresponding object features, is reprojected to image space to build a cost volume that jointly measures semantic and geometric error.
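The reprojection step mentioned above can be illustrated with a standard pinhole camera model. This is only a minimal sketch, not the paper's implementation: the function name `project_to_image` and the intrinsics parameters `fx`, `fy`, `cx`, `cy` are assumptions introduced for illustration, and the cost volume itself is not modeled here.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[float, float]


def project_to_image(points: List[Point3D],
                     fx: float, fy: float,
                     cx: float, cy: float) -> List[Pixel]:
    """Project camera-frame 3D points (x right, y down, z forward) to pixels.

    Hypothetical helper: uses the pinhole model u = fx * x / z + cx,
    v = fy * y / z + cy. Points at or behind the camera plane are skipped.
    """
    pixels = []
    for x, y, z in points:
        if z <= 0:
            continue  # not visible: at or behind the camera plane
        u = fx * x / z + cx
        v = fy * y / z + cy
        pixels.append((u, v))
    return pixels
```

For example, a point on the optical axis projects to the principal point `(cx, cy)` regardless of its depth, while off-axis points move toward the principal point as depth increases.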
With the help of gated recurrent units (GRUs) and attention mechanisms as temporal units, we propose a point cloud completion framework that accepts a sequence of unaligned and sparse inputs, and outputs consistent and aligned point clouds.
The experimental results show that our method achieves state-of-the-art performance on the Monocular 3D Object Detection and Bird's Eye View tasks of the KITTI dataset, and can generalize to images with different camera intrinsics.
Ranked #10 on Monocular 3D Object Detection on KITTI Cars Moderate
Our method, called Stereo R-CNN, extends Faster R-CNN for stereo inputs to simultaneously detect and associate objects in left and right images.
We encode the sparse 3D point cloud with a compact multi-view representation.
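As a rough illustration of what one view in such a representation might look like, the sketch below rasterizes a point cloud into a bird's-eye-view height map. The snippet does not say which views are used or how they are encoded, so the choice of a top-down view, the max-height encoding, the axis convention (x forward, y left, z up), and the name `bev_height_map` are all assumptions.

```python
from typing import List, Optional, Tuple

Point3D = Tuple[float, float, float]


def bev_height_map(points: List[Point3D],
                   x_range: Tuple[float, float],
                   y_range: Tuple[float, float],
                   cell: float) -> List[List[Optional[float]]]:
    """Rasterize points into a top-down grid storing the max height per cell.

    Hypothetical helper: cells that receive no point stay None.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    grid: List[List[Optional[float]]] = [[None] * ny for _ in range(nx)]
    for x, y, z in points:
        # Discard points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp indices to guard against floating-point rounding at the edge.
        i = min(int((x - x_range[0]) / cell), nx - 1)
        j = min(int((y - y_range[0]) / cell), ny - 1)
        if grid[i][j] is None or z > grid[i][j]:
            grid[i][j] = z
    return grid
```

A dense 2D grid like this is compact compared with a full 3D voxel volume, and it can be fed to an ordinary 2D CNN.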
In this paper, we propose a novel edge-preserving and multi-scale contextual neural network for salient object detection.
We then exploit a CNN on top of these proposals to perform object detection.
The focus of this paper is on proposal generation.
Ranked #8 on Vehicle Pose Estimation on KITTI Cars Hard
The goal of this paper is to generate high-quality 3D object proposals in the context of autonomous driving.
Ranked #10 on Vehicle Pose Estimation on KITTI Cars Hard
Based on the characteristics of the superpixel tightness distribution, we propose an effective method, namely multi-thresholding straddling expansion (MTSE), to reduce localization bias via fast diversification.