Search Results for author: Xiaoshui Huang

Found 38 papers, 10 papers with code

3DBench: A Scalable 3D Benchmark and Instruction-Tuning Dataset

1 code implementation 23 Apr 2024 Junjie Zhang, Tianci Hu, Xiaoshui Huang, Yongshun Gong, Dan Zeng

Evaluating the performance of Multi-modal Large Language Models (MLLMs), integrating both point cloud and language, presents significant challenges.

Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation

no code implementations 19 Apr 2024 Yilong Chen, Zongyi Xu, Xiaoshui Huang, Ruicheng Zhang, Xinqi Jiang, Xinbo Gao

Furthermore, to mitigate the influence of erroneous pseudo labels obtained from sparse annotations on point cloud features, we propose a multi-modal weakly supervised network for LiDAR semantic segmentation, called MM-ScatterNet.

Image Segmentation LIDAR Semantic Segmentation +3

Taming Stable Diffusion for Text to 360° Panorama Image Generation

2 code implementations 11 Apr 2024 Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella, Xiaoshui Huang, Dinh Phung, Wanli Ouyang, Jianfei Cai

Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts.

Denoising Image Generation

GVGEN: Text-to-3D Generation with Volumetric Representation

no code implementations 19 Mar 2024 Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He

To simplify the generation of GaussianVolume and empower the model to generate instances with detailed 3D geometry, we propose a coarse-to-fine pipeline.

3D Generation 3D Reconstruction +1

THOR: Text to Human-Object Interaction Diffusion via Relation Intervention

no code implementations 17 Mar 2024 Qianyang Wu, Ye Shi, Xiaoshui Huang, Jingyi Yu, Lan Xu, Jingya Wang

This paper addresses new methodologies to deal with the challenging task of generating dynamic Human-Object Interactions from textual descriptions (Text2HOI).

Human-Object Interaction Detection Object +1

NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection

1 code implementation 22 Feb 2024 Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang

We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as the supervision signal, significantly enhancing the semantic awareness of multi-view detectors.

Depth Estimation Depth Prediction +1

A Comprehensive Survey on 3D Content Generation

1 code implementation 2 Feb 2024 Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, WangMeng Zuo, Junjun Jiang, Xianming Liu

Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D.

3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V

no code implementations 15 Dec 2023 Dingning Liu, Xiaomeng Dong, Renrui Zhang, Xu Luo, Peng Gao, Xiaoshui Huang, Yongshun Gong, Zhihui Wang

In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks.

3D Object Detection object-detection +1

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

no code implementations 14 Dec 2023 Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang

Recent advancements in text-to-3D generation technology have significantly advanced the conversion of textual descriptions into imaginative well-geometrical and finely textured 3D objects.

3D Generation Text to 3D

PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration

no code implementations 11 Dec 2023 Yue Wu, Yongzhe Yuan, Xiaolong Fan, Xiaoshui Huang, Maoguo Gong, Qiguang Miao

We propose a new framework that formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation.

Denoising Point Cloud Registration

Zero-Shot Point Cloud Registration

no code implementations 5 Dec 2023 Weijie Wang, Guofeng Mei, Bin Ren, Xiaoshui Huang, Fabio Poiesi, Luc van Gool, Nicu Sebe, Bruno Lepri

The cornerstone of ZeroReg is the novel transfer of image features from keypoints to the point cloud, enriched by aggregating information from 3D geometric neighborhoods.

Point Cloud Registration

A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling

no code implementations 3 Dec 2023 Wentao Qu, Yuantian Shao, Lingwu Meng, Xiaoshui Huang, Liang Xiao

Most of the existing point cloud upsampling methods focus on sparse point cloud feature extraction and upsampling module design.

Denoising point cloud upsampling

Point Cloud Pre-training with Diffusion Models

no code implementations 25 Nov 2023 Xiao Zheng, Xiaoshui Huang, Guofeng Mei, Yuenan Hou, Zhaoyang Lyu, Bo Dai, Wanli Ouyang, Yongshun Gong

This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object.

Point Cloud Pre-training

Experts Weights Averaging: A New General Training Scheme for Vision Transformers

no code implementations 11 Aug 2023 Yongqi Huang, Peng Ye, Xiaoshui Huang, Sheng Li, Tao Chen, Tong He, Wanli Ouyang

As Vision Transformers (ViTs) are gradually surpassing CNNs in various visual tasks, one may ask whether a training scheme specifically for ViTs exists that can also improve performance without increasing inference cost.

UniG3D: A Unified 3D Object Generation Dataset

no code implementations 19 Jun 2023 Qinghong Sun, Yangguang Li, Zexiang Liu, Xiaoshui Huang, Fenggang Liu, Xihui Liu, Wanli Ouyang, Jing Shao

However, the quality and diversity of existing 3D object generation methods are constrained by the inadequacies of existing 3D object datasets, including issues related to text quality, the incompleteness of multi-modal data representation encompassing 2D rendered images and 3D assets, as well as the size of the dataset.

Autonomous Driving Object

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark

1 code implementation NeurIPS 2023 Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang

To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.

Cross-source Point Cloud Registration: Challenges, Progress and Prospects

no code implementations 23 May 2023 Xiaoshui Huang, Guofeng Mei, Jian Zhang

The emerging topic of cross-source point cloud (CSPC) registration has attracted increasing attention with the rapid development of 3D sensor technologies.

Point Cloud Registration

EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder

2 code implementations 8 Dec 2022 Xiaoshui Huang, Zhou Huang, Sheng Li, Wentao Qu, Tong He, Yuenan Hou, Yifan Zuo, Wanli Ouyang

These token embeddings are concatenated with a task token and fed into the frozen CLIP transformer to learn point cloud representation.

Few-Shot Learning Segmentation +1

Boosting Semi-Supervised 3D Object Detection with Semi-Sampling

no code implementations 14 Nov 2022 Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang

When training a teacher-student semi-supervised framework, we randomly select gt samples and pseudo samples to both labeled frames and unlabeled frames, making a strong data augmentation for them.

3D Object Detection Data Augmentation +2

GMF: General Multimodal Fusion Framework for Correspondence Outlier Rejection

1 code implementation 1 Nov 2022 Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao

In this paper, we propose General Multimodal Fusion (GMF) to learn to reject the correspondence outliers by leveraging both the structure and texture information.

Point Cloud Registration Position

CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training

1 code implementation ICCV 2023 Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, WangMeng Zuo

To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification.

Contrastive Learning Few-Shot Learning +4

COTReg: Coupled Optimal Transport based Point Cloud Registration

no code implementations 29 Dec 2021 Guofeng Mei, Xiaoshui Huang, Litao Yu, Jian Zhang, Mohammed Bennamoun

Generating a set of high-quality correspondences or matches is one of the most critical steps in point cloud registration.

Point Cloud Registration

GenReg: Deep Generative Method for Fast Point Cloud Registration

no code implementations 23 Nov 2021 Xiaoshui Huang, Zongyi Xu, Guofeng Mei, Sheng Li, Jian Zhang, Yifan Zuo, Yucheng Wang

To solve this challenge, we propose a new data-driven registration algorithm by applying deep generative neural networks to point cloud registration.

Point Cloud Registration

IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration

1 code implementation 18 Nov 2021 Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao

In this paper, we propose a new multimodal fusion method to generate a point cloud registration descriptor by considering both structure and texture information.

Ranked #1 on Point Cloud Registration on 3DMatch Benchmark (using extra training data)

Point Cloud Registration

DeepMMSA: A Novel Multimodal Deep Learning Method for Non-small Cell Lung Cancer Survival Analysis

no code implementations 12 Jun 2021 Yujiao Wu, Jie Ma, Xiaoshui Huang, Sai Ho Ling, Steven Weidong Su

To improve the survival prediction accuracy and help prognostic decision-making in clinical practice for medical experts, we for the first time propose a multimodal deep learning method for non-small cell lung cancer (NSCLC) survival analysis, named DeepMMSA.

Decision Making Multimodal Deep Learning +2

A comprehensive survey on point cloud registration

no code implementations 3 Mar 2021 Xiaoshui Huang, Guofeng Mei, Jian Zhang, Rana Abbas

This paper conducts a comprehensive survey, including both same-source and cross-source registration methods, and summarizes the connections between optimization-based and deep learning methods, to provide further research insight.

3D Reconstruction Point Cloud Registration

Causal Discovery from Incomplete Data using An Encoder and Reinforcement Learning

no code implementations 9 Jun 2020 Xiaoshui Huang, Fujin Zhu, Lois Holloway, Ali Haidar

Compared with the direct combination of data imputation and causal discovery methods, our method performs generally better and can even obtain a performance gain of as much as 43.2%.

Causal Discovery Imputation +2

Jointly Modeling Intra- and Inter-transaction Dependencies with Hierarchical Attentive Transaction Embeddings for Next-item Recommendation

no code implementations 30 May 2020 Shoujin Wang, Longbing Cao, Liang Hu, Shlomo Berkovsky, Xiaoshui Huang, Lin Xiao, Wenpeng Lu

Most existing TBRSs recommend the next item by modeling only the intra-transaction dependency within the current transaction, while ignoring the inter-transaction dependency with recent transactions that may also affect the next item.

Recommendation Systems

Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Images for Segmentation

no code implementations 8 May 2020 Shuchao Pang, Anan Du, Mehmet A. Orgun, Yan Wang, Quanzheng Sheng, Shoujin Wang, Xiaoshui Huang, Zhemei Yu

To mitigate this shortcoming, we propose a novel group equivariant segmentation framework by encoding those inherent symmetries for learning more precise representations.

Segmentation Tumor Segmentation

Feature-metric Registration: A Fast Semi-supervised Approach for Robust Point Cloud Registration without Correspondences

1 code implementation CVPR 2020 Xiaoshui Huang, Guofeng Mei, Jian Zhang

We present a fast feature-metric point cloud registration framework, which enforces the optimisation of registration by minimising a feature-metric projection error without correspondences.

Point Cloud Registration

Fast Registration for cross-source point clouds by using weak regional affinity and pixel-wise refinement

no code implementations 11 Mar 2019 Xiaoshui Huang, Lixin Fan, Qiang Wu, Jian Zhang, Chun Yuan

Accurate and fast registration of cross-source 3D point clouds from different sensors is an emerging research problem in computer vision.

Point Cloud Registration

Learning a 3D descriptor for cross-source point cloud registration from synthetic data

no code implementations 24 Aug 2017 Xiaoshui Huang

With the development of 3D sensors, registration of 3D data (e.g. point clouds) coming from different kinds of sensors is indispensable and in great demand.

Data Augmentation Point Cloud Registration

A coarse-to-fine algorithm for registration in 3D street-view cross-source point clouds

no code implementations 24 Oct 2016 Xiaoshui Huang, Jian Zhang, Qiang Wu, Lixin Fan, Chun Yuan

In this paper, different from previous ICP-based methods, and from a statistical view, we propose an effective coarse-to-fine algorithm to detect and register a small-scale SFM point cloud in a large-scale Lidar point cloud.
