Depth sensors used in autonomous driving and gaming systems often report 3D point clouds. Because these point clouds lack regular structure, such systems cannot directly exploit recent advances in convolutional neural networks, which depend on traditional filtering and pooling operations. Analogous to image-based convolutional architectures, recently introduced graph-based architectures afford similar filtering and pooling operations on arbitrary graphs. We adapt these graph-based methods to 3D point clouds and introduce a generic vector representation of 3D graphs, which we call Graph 3D (G3D). We believe we are the first to apply large-scale transfer learning to 3D point cloud data, and we demonstrate the discriminative power of our salient latent representation of 3D point clouds on unseen test sets. By using our G3D network (G3DNet) as a feature extractor and pairing G3D feature vectors with a standard classifier, we achieve the best accuracy for a graph network on ModelNet10 (93.1%) and ModelNet40 (91.7%), and performance comparable to other methods on the Sydney Urban Objects dataset. This general-purpose feature extractor can be used as an off-the-shelf component in other 3D scene understanding or object tracking systems.
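
The evaluation pipeline the abstract describes (a frozen feature extractor followed by a standard classifier) can be sketched as below. This is a minimal illustration, not the authors' code: `extract_g3d_features` is a hypothetical stand-in for a forward pass through a pretrained G3DNet-18, and random features and labels are used so the sketch runs end to end. The SVM mirrors the ModelNet10 configuration reported in the results table.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def extract_g3d_features(point_clouds):
    """Hypothetical stand-in for a pretrained G3DNet-18 forward pass:
    maps each (N, 3) point cloud to a fixed-length G3D feature vector.
    Faked here with random 512-d features so the example is runnable."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(point_clouds), 512))

# Toy data: 100 point clouds of 1024 points each, with 10 class labels.
rng = np.random.default_rng(1)
clouds = [rng.random((1024, 3)) for _ in range(100)]
labels = rng.integers(0, 10, size=100)

# Frozen G3D features paired with an off-the-shelf classifier (SVM).
features = extract_g3d_features(clouds)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(features, labels)
print("train accuracy:", clf.score(features, labels))
```

With a real pretrained extractor, the classifier (SVM or MLP) is trained only on the extracted vectors, so the 3D network never needs to be fine-tuned end to end for each downstream dataset.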

Results from the Paper


Ranked #2 on 3D Object Classification on ModelNet40 (using extra training data)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| 3D Object Classification | ModelNet10 | G3DNet-18 (SVM, Fine-Tuned, Vote) | Accuracy | 93.1 | #3 |
| 3D Object Classification | ModelNet40 | G3DNet-18 (MLP, Fine-Tuned, Vote) | Classification Accuracy | 91.7 | #2 |
| 3D Point Cloud Classification | ModelNet40 | G3DNet-18 (MLP, Fine-Tuned, Vote) | Overall Accuracy | 91.7 | #87 |
| 3D Point Cloud Classification | Sydney Urban Objects | G3DNet-18 (MLP, Fine-Tuned, Vote) | F1 | 72.7 | #3 |

Methods