LidarMultiNet is extensively tested on both the Waymo Open Dataset and the nuScenes dataset, demonstrating for the first time that the major LiDAR perception tasks can be unified in a single strong network that is trained end-to-end and achieves state-of-the-art performance.
It then uses the features of the center candidates as the query embeddings in the transformer.
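The idea above can be sketched as gathering per-candidate features from a dense BEV feature map at the predicted center locations and treating them as transformer queries. This is a minimal illustrative sketch, not the paper's implementation; the function name and shapes are assumptions.

```python
import numpy as np

def center_query_embeddings(bev_features, centers):
    """Gather features at center-candidate locations to use as query
    embeddings (hypothetical helper; shapes are assumptions).

    bev_features: (C, H, W) dense feature map
    centers:      (K, 2) integer (row, col) center candidates
    returns:      (K, C) query embeddings, one per candidate
    """
    rows, cols = centers[:, 0], centers[:, 1]
    # Fancy indexing pulls a (C, K) slice; transpose to (K, C).
    return bev_features[:, rows, cols].T

# Toy example: 64-channel 128x128 map, 3 center candidates.
feat = np.random.rand(64, 128, 128)
cands = np.array([[10, 20], [55, 7], [99, 100]])
queries = center_query_embeddings(feat, cands)  # shape (3, 64)
```

Each query thus starts from the local evidence that produced its center candidate, rather than from a learned-from-scratch embedding.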
This technical report presents the 1st place winning solution for the Waymo Open Dataset 3D semantic segmentation challenge 2022.
Compared with popular sampling methods such as Farthest Point Sampling (FPS) and Ball Query, CAGQ achieves up to a 50× speed-up.
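For context on what CAGQ is being compared against, a naive FPS baseline iteratively picks the point farthest from the already-chosen set, requiring O(N·k) distance updates; this per-point cost is what grid-based querying avoids. The sketch below is an illustrative baseline, not the paper's implementation.

```python
import numpy as np

def farthest_point_sampling(points, k):
    """Naive FPS over an (N, 3) point cloud: repeatedly select the point
    with the largest distance to the current sample set."""
    chosen = [0]  # seed with the first point
    dists = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))            # farthest remaining point
        chosen.append(nxt)
        # Keep, per point, its distance to the nearest chosen point.
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(chosen)

pts = np.random.rand(1000, 3)
idx = farthest_point_sampling(pts, 16)  # indices of 16 well-spread points
```

Every iteration scans all N points, which is why FPS becomes a bottleneck on large clouds and why a 50× speed-up from grid-based grouping is plausible.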
We present a novel deep neural network architecture for end-to-end scene flow estimation that directly operates on large-scale 3D point clouds.
This framework 1) effectively enlarges the receptive fields (RF) of the network to aggregate global information; 2) alleviates what we call the "gridding issue" caused by the standard dilated convolution operation.
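The "gridding issue" can be seen in one dimension: a 3×3 kernel with dilation 2 only samples inputs at even offsets, and stacking layers with the same dilation means an output unit never sees the odd positions in between. A small sketch (assumed helper name, 1-D for clarity):

```python
def receptive_positions(center, dilation, size=3):
    """Input positions sampled by one 1-D dilated conv tap of the given size."""
    half = size // 2
    return [center + dilation * o for o in range(-half, half + 1)]

# One dilation-2 layer seen from output position 0:
layer2 = receptive_positions(0, 2)  # [-2, 0, 2]

# Trace back through a second dilation-2 layer below it:
layer1 = sorted({p for c in layer2 for p in receptive_positions(c, 2)})
# layer1 == [-4, -2, 0, 2, 4]: only even offsets are ever sampled,
# so information at odd positions never reaches this output unit.
```

Mixing dilation rates across layers (as the framework's design implies) fills in these gaps while keeping the enlarged receptive field.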
Our results show that, as in the behavioral data, the correlation between subordinate level face and object recognition accuracy increases as experience grows.
Our results suggest that the relative order of importance of central visual field information is face recognition > object recognition > scene recognition, and vice versa for peripheral information.
We instantiate this idea by training a deep CNN to perform basic level object categorization first, and then train it on subordinate level categorization.
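This two-stage curriculum can be sketched with a toy classifier: train on coarse (basic-level) labels first, then continue training on fine (subordinate-level) labels starting from the basic-level weights. The softmax-regression stand-in below is an assumption for illustration; the study uses a deep CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 subordinate classes; pairs (0,1) and (2,3) form 2 basic classes.
X = rng.normal(size=(200, 8)) + np.repeat(np.eye(4, 8) * 3.0, 50, axis=0)
y_sub = np.repeat(np.arange(4), 50)   # subordinate labels 0..3
y_basic = y_sub // 2                  # basic labels 0..1

def train_softmax(X, y, n_classes, W=None, epochs=300, lr=0.1):
    """Plain softmax regression by gradient descent (toy stand-in for the CNN)."""
    if W is None:
        W = np.zeros((X.shape[1], n_classes))
    for _ in range(epochs):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0        # gradient of cross-entropy
        W = W - lr * (X.T @ p) / len(y)
    return W

# Stage 1: basic-level categorization.
W_basic = train_softmax(X, y_basic, 2)
# Stage 2: subordinate-level, initialized from the basic-level weights
# (each basic-class column duplicated for its two subordinate classes).
W_sub = train_softmax(X, y_sub, 4, W=np.repeat(W_basic, 2, axis=1))
acc = float((np.argmax(X @ W_sub, axis=1) == y_sub).mean())
```

The duplicated initialization is how the coarse-to-fine transfer shows up here: stage 2 starts from a solution that already separates the basic categories and only has to split them further.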
Bottom-up and top-down attention are applied, respectively, to localizing the object of interest (via a saliency map) and to object recognition.