While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can incur high labeling costs.
no code implementations • 22 Dec 2021 • Jingxiao Zheng, Xinwei Shi, Alexander Gorban, Junhua Mao, Yang Song, Charles R. Qi, Ting Liu, Visesh Chari, Andre Cornman, Yin Zhou, CongCong Li, Dragomir Anguelov
3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors, including the 3D resolution and range of data, absence of dense depth maps, failure modes for LiDAR, relative location between the camera and LiDAR, and a high bar for estimation accuracy.
Given the insight that SDE would benefit from more accurate geometry descriptions, we propose to represent objects as amodal contours, specifically amodal star-shaped polygons, and devise a simple model, StarPoly, to predict such contours.
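A star-shaped polygon can be parameterized by one radius per fixed angle bin around a center; a minimal decoding sketch (the function name and interface are illustrative assumptions, not the StarPoly implementation):

```python
import numpy as np

def star_polygon(center, radii):
    """Decode a star-shaped polygon: one radius per fixed angle bin.

    center: (2,) xy center; radii: (K,) distances along K evenly
    spaced rays.  Returns (K, 2) contour vertices.  Hypothetical
    decoding, not the paper's API.
    """
    k = len(radii)
    angles = np.linspace(0.0, 2.0 * np.pi, k, endpoint=False)
    offsets = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return np.asarray(center) + np.asarray(radii)[:, None] * offsets

# A unit circle approximated with 8 rays:
contour = star_polygon([0.0, 0.0], np.ones(8))
```

Because every ray starts at the shared center, any radius vector yields a valid star-shaped (self-intersection-free) contour, which makes the representation easy to regress.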
However, most prior work focuses on the generic point cloud representation, neglecting the spatial patterns of points in LiDAR range images.
On the Waymo Open Dataset and KITTI, SPG improves 3D detection results of these two methods across all categories.
Ranked #2 on 3D Object Detection on KITTI Cars Hard
While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels.
no code implementations • Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao, Sabeek Pradhan, Yuning Chai, Ben Sapp, Charles R. Qi, Yin Zhou, Zoey Yang, Aurelien Chouard, Pei Sun, Jiquan Ngiam, Vijay Vasudevan, Alexander McCauley, Jonathon Shlens, Dragomir Anguelov
Furthermore, we introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models.
To this end, we select a suite of diverse datasets and tasks to measure the effect of unsupervised pre-training on a large source set of 3D scenes.
We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid.
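One way to aggregate sparse views into a volumetric feature grid is to scatter per-view features into voxels and average where several views observe the same cell; a minimal sketch of that aggregation step (interfaces are hypothetical, not the paper's):

```python
import numpy as np

def aggregate_views(grid_shape, voxel_ids, feats):
    """Scatter per-view features into a volumetric feature grid,
    averaging where multiple views land in the same voxel.
    Illustrative sketch of multi-view aggregation only."""
    c = feats.shape[1]
    grid = np.zeros(grid_shape + (c,))
    count = np.zeros(grid_shape)
    for (i, j, k), f in zip(voxel_ids, feats):
        grid[i, j, k] += f      # accumulate features per voxel
        count[i, j, k] += 1     # track how many views hit the voxel
    nz = count > 0
    grid[nz] /= count[nz][:, None]  # mean over observing views
    return grid

# Two views observe voxel (0,0,0); one observes (1,1,1).
ids = np.array([[0, 0, 0], [0, 0, 0], [1, 1, 1]])
feats = np.array([[2.0], [4.0], [7.0]])
g = aggregate_views((2, 2, 2), ids, feats)
```

Averaging keeps the grid independent of the number and order of input views, which is what makes a sparse, variable-size view set usable.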
Compared to prior work on multi-modal detection, we explicitly extract both geometric and semantic features from the 2D images.
Ranked #1 on 3D Object Detection on SUN-RGBD (using extra training data)
Current 3D object detection methods are heavily influenced by 2D detectors.
Ranked #9 on 3D Object Detection on SUN-RGBD val
Furthermore, these locations are continuous in space and can be learned by the network.
Ranked #1 on 3D Semantic Segmentation on SensatUrban
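Continuous, learnable kernel locations are typically used by weighting each input point's contribution by its distance to every kernel point, in the style of KPConv; a sketch under that assumption (names are not from the paper):

```python
import numpy as np

def kernel_point_weights(neighbors, kernel_points, sigma):
    """Linear correlation between neighbor offsets and kernel points.

    neighbors: (N, 3) offsets from the convolution center;
    kernel_points: (K, 3) continuous (learnable) locations.
    Returns (N, K) influence weights that fall off linearly with
    distance.  KPConv-style sketch, not the paper's exact operator.
    """
    d = np.linalg.norm(neighbors[:, None, :] - kernel_points[None, :, :],
                       axis=-1)
    return np.maximum(0.0, 1.0 - d / sigma)

kp = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # 2 kernel points
nb = np.array([[0.0, 0.0, 0.0]])                   # 1 neighbor at center
w = kernel_point_weights(nb, kp, sigma=1.0)
```

Because the weight is a differentiable function of the kernel point coordinates, gradients flow back to those coordinates and the network can move them freely in continuous space.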
Deep neural networks are known to be vulnerable to adversarial examples which are carefully crafted instances to cause the models to make wrong predictions.
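The canonical illustration of such crafted instances is the fast gradient sign method (FGSM), shown here for a logistic model (an illustrative example, not necessarily the attack studied in this work):

```python
import numpy as np

def fgsm_linear(x, w, b, y, eps):
    """One-step FGSM on a logistic model p = sigmoid(w.x + b).

    Perturbs x by eps in the sign of the loss gradient, i.e. the
    direction that most increases the loss for true label y.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad_x = (p - y) * w               # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)   # small, worst-case perturbation

x = np.array([1.0, -1.0])
w = np.array([2.0, 1.0])
x_adv = fgsm_linear(x, w, b=0.0, y=1.0, eps=0.1)
```

Even though each coordinate moves by at most eps, the model's score for the true class drops, which is exactly what makes these instances "carefully crafted."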
The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks.
DeePa is a deep learning framework that explores parallelism in all parallelizable dimensions to accelerate the training process of convolutional neural networks.
In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes.
Ranked #1 on Object Localization on KITTI Cyclists Moderate
By exploiting metric space distances, our network is able to learn local features with increasing contextual scales.
Ranked #3 on 3D Semantic Segmentation on KITTI-360
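Local features at increasing contextual scales are commonly gathered with metric-space ball queries of growing radius; a minimal sketch of that grouping step (the function name is illustrative):

```python
import numpy as np

def ball_query(points, center, radius):
    """Indices of points within `radius` of `center` in metric space.

    Stacking queries with growing radii gives neighborhoods, and
    hence learned features, with increasing contextual scale."""
    d = np.linalg.norm(points - center, axis=1)
    return np.nonzero(d <= radius)[0]

pts = np.array([[0.0, 0.0, 0.0],
                [0.5, 0.0, 0.0],
                [2.0, 0.0, 0.0]])
small = ball_query(pts, np.zeros(3), radius=1.0)  # local scale
large = ball_query(pts, np.zeros(3), radius=3.0)  # wider context
```

A fixed metric radius, unlike k-nearest neighbors, keeps the physical scale of each neighborhood constant regardless of sampling density.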
A point cloud is an important type of geometric data structure.
Ranked #2 on Scene Segmentation on ScanNet
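The key property when consuming raw point clouds is invariance to point ordering, usually obtained with a shared per-point transform followed by a symmetric pooling; a minimal PointNet-style sketch (weights are illustrative):

```python
import numpy as np

def global_feature(points, w):
    """Permutation-invariant point cloud encoder: a shared linear map
    (+ ReLU) applied to each point, then a symmetric max pool over
    the set.  Minimal sketch, not the full architecture."""
    per_point = np.maximum(0.0, points @ w)  # shared "MLP" per point
    return per_point.max(axis=0)             # order-independent pool

rng = np.random.default_rng(0)
pts = rng.normal(size=(16, 3))
w = rng.normal(size=(3, 8))
f1 = global_feature(pts, w)
f2 = global_feature(pts[::-1], w)  # permuted input, identical feature
```

Because max pooling ignores order, reshuffling the 16 points leaves the global feature unchanged.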
Each field probing filter is a set of probing points: sensors that perceive the space.
Ranked #3 on 3D Object Recognition on ModelNet40
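A field-probing filter can be sketched as reading a 3D field at its probing points and combining the sensed values with per-probe weights (nearest-voxel lookup here; names and the lookup scheme are illustrative assumptions):

```python
import numpy as np

def probe_field(field, probes, weights):
    """Apply one field-probing filter: read the 3D field at a set of
    probing points (nearest-voxel lookup) and combine the sensed
    values with per-probe weights.  Illustrative sketch only."""
    idx = np.clip(np.round(probes).astype(int), 0,
                  np.array(field.shape) - 1)
    sensed = field[idx[:, 0], idx[:, 1], idx[:, 2]]  # "perceive" space
    return float(sensed @ weights)                   # weighted response

field = np.zeros((4, 4, 4))
field[1, 2, 3] = 5.0  # a single occupied cell in the field
probes = np.array([[1.0, 2.0, 3.0],   # probe on the occupied cell
                   [0.0, 0.0, 0.0]])  # probe in empty space
out = probe_field(field, probes, np.array([1.0, 1.0]))
```

Since the response depends only on a handful of sampled locations rather than the full volume, both the probe positions and their weights can be optimized cheaply.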
Empirical results from these two types of CNNs exhibit a large gap, indicating that existing volumetric CNN architectures and approaches are unable to fully exploit the power of 3D representations.
Ranked #1 on 3D Object Recognition on ModelNet40
Object viewpoint estimation from 2D images is an essential task in computer vision.