Search Results for author: Xiaoyang Wu

Found 23 papers, 16 papers with code

Multi-Space Alignments Towards Universal LiDAR Segmentation

1 code implementation CVPR 2024 Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma

A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception.

Autonomous Driving, Diversity, +1

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

1 code implementation CVPR 2024 Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost.

Ranked #5 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric)

3D Semantic Segmentation, LIDAR Semantic Segmentation

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

1 code implementation CVPR 2024 Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao

Multimodal Large Language Models (MLLMs) have excelled in 2D image-text comprehension and image generation, but their understanding of the 3D world is notably deficient, limiting progress in 3D language understanding and generation.

3D Generation, Reading Comprehension

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm

1 code implementation 12 Oct 2023 Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, Xianglong He, Hengshuang Zhao, Chunhua Shen, Yu Qiao, Tong He, Wanli Ouyang

In this paper, we introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation, thereby establishing a pathway to 3D foundational models.

Ranked #2 on Semantic Segmentation on ScanNet (using extra training data)

3D Object Detection, 3D Reconstruction, +5

Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training

1 code implementation CVPR 2024 Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao

In contrast, such privilege has not yet fully benefited 3D deep learning, mainly due to the limited availability of large-scale 3D datasets.

Ranked #3 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric, using extra training data)

3D Semantic Segmentation, LIDAR Semantic Segmentation, +1

MarS3D: A Plug-and-Play Motion-Aware Model for Semantic Segmentation on Multi-Scan 3D Point Clouds

1 code implementation CVPR 2023 Jiahui Liu, Chirui Chang, Jianhui Liu, Xiaoyang Wu, Lan Ma, Xiaojuan Qi

Unlike the single-scan-based semantic segmentation task, this task requires distinguishing the motion states of points in addition to their semantic categories.

3D Semantic Segmentation, Representation Learning, +1

Hierarchical Dense Correlation Distillation for Few-Shot Segmentation-Extended Abstract

no code implementations 27 Jun 2023 Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia

We hope our work can benefit broader industrial applications where novel classes with limited annotations are required to be decently identified.

Few-Shot Semantic Segmentation, Segmentation, +2

SAM3D: Segment Anything in 3D Scenes

1 code implementation 6 Jun 2023 Yunhan Yang, Xiaoyang Wu, Tong He, Hengshuang Zhao, Xihui Liu

In this work, we propose SAM3D, a novel framework that is able to predict masks in 3D point clouds by leveraging the Segment-Anything Model (SAM) in RGB images without further training or finetuning.


OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

no code implementations 2 Jun 2023 Zhangyang Qi, Jiaqi Wang, Xiaoyang Wu, Hengshuang Zhao

Multi-view 3D object detection is becoming popular in autonomous driving due to its high effectiveness and low cost.

3D Object Detection, Autonomous Driving, +3

Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning

1 code implementation CVPR 2023 Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao

As a pioneering work, PointContrast conducts unsupervised 3D representation learning via leveraging contrastive learning over raw RGB-D frames and proves its effectiveness on various downstream tasks.

Ranked #11 on Semantic Segmentation on ScanNet (val mIoU metric, using extra training data)

Contrastive Learning, Data Augmentation, +3

GeoSpark: Sparking up Point Cloud Segmentation with Geometry Clue

no code implementations 14 Mar 2023 Zhening Huang, Xiaoyang Wu, Hengshuang Zhao, Lei Zhu, Shujun Wang, Georgios Hadjidemetriou, Ioannis Brilakis

For feature aggregation, it improves feature modeling by allowing the network to learn from both local points and neighboring geometry partitions, resulting in an enlarged data-tailored receptive field.

Point Cloud Segmentation

Understanding Imbalanced Semantic Segmentation Through Neural Collapse

2 code implementations CVPR 2023 Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia

Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.

3D Semantic Segmentation, Segmentation

Point Transformer V2: Grouped Vector Attention and Partition-based Pooling

2 code implementations 11 Oct 2022 Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao

In this work, we analyze the limitations of the Point Transformer and propose our powerful and efficient Point Transformer V2 model with novel designs that overcome the limitations of previous work.

3D Point Cloud Classification, 3D Semantic Segmentation, +5

A practical convolutional neural network as loop filter for intra frame

no code implementations 16 May 2018 Xiaodan Song, Jiabao Yao, Lulu Zhou, Li Wang, Xiaoyang Wu, Di Xie, ShiLiang Pu

It aims to design a single CNN model with low redundancy to adapt to decoded frames with different qualities and ensure consistency.

