Search Results for author: Xiaoshui Huang

Found 38 papers, 10 papers with code

3DBench: A Scalable 3D Benchmark and Instruction-Tuning Dataset

1 code implementation • 23 Apr 2024 • Junjie Zhang, Tianci Hu, Xiaoshui Huang, Yongshun Gong, Dan Zeng

Evaluating the performance of Multi-modal Large Language Models (MLLMs) that integrate both point clouds and language presents significant challenges.

Foundation Model assisted Weakly Supervised LiDAR Semantic Segmentation

no code implementations • 19 Apr 2024 • Yilong Chen, Zongyi Xu, Xiaoshui Huang, Ruicheng Zhang, Xinqi Jiang, Xinbo Gao

Furthermore, to mitigate the influence of erroneous pseudo labels obtained from sparse annotations on point cloud features, we propose a multi-modal weakly supervised network for LiDAR semantic segmentation, called MM-ScatterNet.

Image Segmentation LIDAR Semantic Segmentation +3

Taming Stable Diffusion for Text to 360° Panorama Image Generation

2 code implementations • 11 Apr 2024 • Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella, Xiaoshui Huang, Dinh Phung, Wanli Ouyang, Jianfei Cai

Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts.

Denoising Image Generation

GVGEN: Text-to-3D Generation with Volumetric Representation

no code implementations • 19 Mar 2024 • Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He

To simplify the generation of GaussianVolume and empower the model to generate instances with detailed 3D geometry, we propose a coarse-to-fine pipeline.

3D Generation 3D Reconstruction +1

THOR: Text to Human-Object Interaction Diffusion via Relation Intervention

no code implementations • 17 Mar 2024 • Qianyang Wu, Ye Shi, Xiaoshui Huang, Jingyi Yu, Lan Xu, Jingya Wang

This paper addresses new methodologies to deal with the challenging task of generating dynamic Human-Object Interactions from textual descriptions (Text2HOI).

Human-Object Interaction Detection Object +1

NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection

1 code implementation • 22 Feb 2024 • Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang

We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as the supervision signal, significantly enhancing the semantic awareness of multi-view detectors.

Depth Estimation Depth Prediction +1

A Comprehensive Survey on 3D Content Generation

1 code implementation • 2 Feb 2024 • Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, WangMeng Zuo, Junjun Jiang, Xianming Liu

Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D.

3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V

no code implementations • 15 Dec 2023 • Dingning Liu, Xiaomeng Dong, Renrui Zhang, Xu Luo, Peng Gao, Xiaoshui Huang, Yongshun Gong, Zhihui Wang

In this work, we present a new visual prompting method called 3DAxiesPrompts (3DAP) to unleash the capabilities of GPT-4V in performing 3D spatial tasks.

3D Object Detection object-detection +1

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

no code implementations • 14 Dec 2023 • Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang

Recent advancements in text-to-3D generation technology have significantly improved the conversion of textual descriptions into imaginative 3D objects with sound geometry and fine textures.

3D Generation Text to 3D

PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration

no code implementations • 11 Dec 2023 • Yue Wu, Yongzhe Yuan, Xiaolong Fan, Xiaoshui Huang, Maoguo Gong, Qiguang Miao

We propose a new framework that formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation.

Denoising Point Cloud Registration

Zero-Shot Point Cloud Registration

no code implementations • 5 Dec 2023 • Weijie Wang, Guofeng Mei, Bin Ren, Xiaoshui Huang, Fabio Poiesi, Luc van Gool, Nicu Sebe, Bruno Lepri

The cornerstone of ZeroReg is the novel transfer of image features from keypoints to the point cloud, enriched by aggregating information from 3D geometric neighborhoods.

Point Cloud Registration

A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling

no code implementations • 3 Dec 2023 • Wentao Qu, Yuantian Shao, Lingwu Meng, Xiaoshui Huang, Liang Xiao

Most of the existing point cloud upsampling methods focus on sparse point cloud feature extraction and upsampling module design.

Denoising point cloud upsampling

Point Cloud Pre-training with Diffusion Models

no code implementations • 25 Nov 2023 • Xiao Zheng, Xiaoshui Huang, Guofeng Mei, Yuenan Hou, Zhaoyang Lyu, Bo Dai, Wanli Ouyang, Yongshun Gong

This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object.

Point Cloud Pre-training

Experts Weights Averaging: A New General Training Scheme for Vision Transformers

no code implementations • 11 Aug 2023 • Yongqi Huang, Peng Ye, Xiaoshui Huang, Sheng Li, Tao Chen, Tong He, Wanli Ouyang

As Vision Transformers (ViTs) are gradually surpassing CNNs in various visual tasks, one may ask whether a training scheme designed specifically for ViTs exists that can also improve performance without increasing inference cost.

UniG3D: A Unified 3D Object Generation Dataset

no code implementations • 19 Jun 2023 • Qinghong Sun, Yangguang Li, Zexiang Liu, Xiaoshui Huang, Fenggang Liu, Xihui Liu, Wanli Ouyang, Jing Shao

However, the quality and diversity of existing 3D object generation methods are constrained by the inadequacies of existing 3D object datasets, including issues related to text quality, the incompleteness of multi-modal data representation encompassing 2D rendered images and 3D assets, as well as the size of the dataset.

Autonomous Driving Object

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark

1 code implementation • NeurIPS 2023 • Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang

To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.

Cross-source Point Cloud Registration: Challenges, Progress and Prospects

no code implementations • 23 May 2023 • Xiaoshui Huang, Guofeng Mei, Jian Zhang

The emerging topic of cross-source point cloud (CSPC) registration has attracted increasing attention with the rapid development of 3D sensor technologies.

Point Cloud Registration

Unsupervised Deep Probabilistic Approach for Partial Point Cloud Registration

no code implementations • CVPR 2023 • Guofeng Mei, Hao Tang, Xiaoshui Huang, Weijie Wang, Juan Liu, Jian Zhang, Luc van Gool, Qiang Wu

Deep point cloud registration methods struggle with partial overlaps and rely on labeled data.

Point Cloud Registration

3D Point Cloud Pre-training with Knowledge Distillation from 2D Images

no code implementations • 17 Dec 2022 • Yuan YAO, Yuanhan Zhang, Zhenfei Yin, Jiebo Luo, Wanli Ouyang, Xiaoshui Huang

The recent success of pre-trained 2D vision models is mostly attributable to learning from large-scale datasets.

Concept Alignment Knowledge Distillation +6

EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder

2 code implementations • 8 Dec 2022 • Xiaoshui Huang, Zhou Huang, Sheng Li, Wentao Qu, Tong He, Yuenan Hou, Yifan Zuo, Wanli Ouyang

These token embeddings are concatenated with a task token and fed into the frozen CLIP transformer to learn point cloud representation.

Few-Shot Learning Segmentation +1

Boosting Semi-Supervised 3D Object Detection with Semi-Sampling

no code implementations • 14 Nov 2022 • Xiaopei Wu, Yang Zhao, Liang Peng, Hua Chen, Xiaoshui Huang, Binbin Lin, Haifeng Liu, Deng Cai, Wanli Ouyang

When training a teacher-student semi-supervised framework, we randomly select gt samples and pseudo samples and add them to both labeled and unlabeled frames, which acts as a strong data augmentation for them.

3D Object Detection Data Augmentation +2

Multimodal Learning for Non-small Cell Lung Cancer Prognosis

no code implementations • 7 Nov 2022 • Yujiao Wu, Yaxiong Wang, Xiaoshui Huang, Fan Yang, Sai Ho Ling, Steven Weidong Su

This paper focuses on the task of survival time analysis for lung cancer.

Decision Making Survival Analysis

GMF: General Multimodal Fusion Framework for Correspondence Outlier Rejection

1 code implementation • 1 Nov 2022 • Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao

In this paper, we propose General Multimodal Fusion (GMF) to learn to reject the correspondence outliers by leveraging both the structure and texture information.

Point Cloud Registration Position

CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training

1 code implementation • ICCV 2023 • Tianyu Huang, Bowen Dong, Yunhan Yang, Xiaoshui Huang, Rynson W. H. Lau, Wanli Ouyang, WangMeng Zuo

To address this issue, we propose CLIP2Point, an image-depth pre-training method by contrastive learning to transfer CLIP to the 3D domain, and adapt it to point cloud classification.

Ranked #3 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)

Contrastive Learning Few-Shot Learning +4

Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Image Segmentation

no code implementations • 29 Jul 2022 • Shuchao Pang, Anan Du, Mehmet A. Orgun, Yan Wang, Quan Z. Sheng, Shoujin Wang, Xiaoshui Huang, Zhenmei Yu

Automatic tumor or lesion segmentation is a crucial step in medical image analysis for computer-aided diagnosis.

Image Segmentation Lesion Segmentation +3

COTReg: Coupled Optimal Transport based Point Cloud Registration

no code implementations • 29 Dec 2021 • Guofeng Mei, Xiaoshui Huang, Litao Yu, Jian Zhang, Mohammed Bennamoun

Generating a set of high-quality correspondences or matches is one of the most critical steps in point cloud registration.

Point Cloud Registration

GenReg: Deep Generative Method for Fast Point Cloud Registration

no code implementations • 23 Nov 2021 • Xiaoshui Huang, Zongyi Xu, Guofeng Mei, Sheng Li, Jian Zhang, Yifan Zuo, Yucheng Wang

To solve this challenge, we propose a new data-driven registration algorithm by applying deep generative neural networks to point cloud registration.

Point Cloud Registration

IMFNet: Interpretable Multimodal Fusion for Point Cloud Registration

1 code implementation • 18 Nov 2021 • Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao

In this paper, we propose a new multimodal fusion method to generate a point cloud registration descriptor by considering both structure and texture information.

Ranked #1 on Point Cloud Registration on 3DMatch Benchmark (using extra training data)

Point Cloud Registration

DeepMMSA: A Novel Multimodal Deep Learning Method for Non-small Cell Lung Cancer Survival Analysis

no code implementations • 12 Jun 2021 • Yujiao Wu, Jie Ma, Xiaoshui Huang, Sai Ho Ling, Steven Weidong Su

To improve the survival prediction accuracy and help prognostic decision-making in clinical practice for medical experts, we for the first time propose a multimodal deep learning method for non-small cell lung cancer (NSCLC) survival analysis, named DeepMMSA.

Decision Making Multimodal Deep Learning +2

A comprehensive survey on point cloud registration

no code implementations • 3 Mar 2021 • Xiaoshui Huang, Guofeng Mei, Jian Zhang, Rana Abbas

This survey provides a comprehensive review, covering both same-source and cross-source registration methods, and summarizes the connections between optimization-based and deep learning methods to provide further research insight.

3D Reconstruction Point Cloud Registration

Causal Discovery from Incomplete Data using An Encoder and Reinforcement Learning

no code implementations • 9 Jun 2020 • Xiaoshui Huang, Fujin Zhu, Lois Holloway, Ali Haidar

Compared with the direct combination of data imputation and causal discovery methods, our method performs generally better and can even obtain a performance gain of as much as 43.2%.

Causal Discovery Imputation +2

Jointly Modeling Intra- and Inter-transaction Dependencies with Hierarchical Attentive Transaction Embeddings for Next-item Recommendation

no code implementations • 30 May 2020 • Shoujin Wang, Longbing Cao, Liang Hu, Shlomo Berkovsky, Xiaoshui Huang, Lin Xiao, Wenpeng Lu

Most existing transaction-based recommender systems (TBRSs) recommend the next item by modeling only the intra-transaction dependency within the current transaction, ignoring the inter-transaction dependency with recent transactions that may also affect the next item.

Recommendation Systems

Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Images for Segmentation

no code implementations • 8 May 2020 • Shuchao Pang, Anan Du, Mehmet A. Orgun, Yan Wang, Quanzheng Sheng, Shoujin Wang, Xiaoshui Huang, Zhemei Yu

To mitigate this shortcoming, we propose a novel group equivariant segmentation framework by encoding those inherent symmetries for learning more precise representations.

Segmentation Tumor Segmentation

Feature-metric Registration: A Fast Semi-supervised Approach for Robust Point Cloud Registration without Correspondences

1 code implementation • CVPR 2020 • Xiaoshui Huang, Guofeng Mei, Jian Zhang

We present a fast feature-metric point cloud registration framework, which enforces the optimisation of registration by minimising a feature-metric projection error without correspondences.

Point Cloud Registration

Fast Registration for cross-source point clouds by using weak regional affinity and pixel-wise refinement

no code implementations • 11 Mar 2019 • Xiaoshui Huang, Lixin Fan, Qiang Wu, Jian Zhang, Chun Yuan

Accurate and fast registration of cross-source 3D point clouds from different sensors is an emerging research problem in computer vision.

Point Cloud Registration

Learning a 3D descriptor for cross-source point cloud registration from synthetic data

no code implementations • 24 Aug 2017 • Xiaoshui Huang

With the development of 3D sensors, registration of 3D data (e.g., point clouds) coming from different kinds of sensors is indispensable and in great demand.

Data Augmentation Point Cloud Registration

A coarse-to-fine algorithm for registration in 3D street-view cross-source point clouds

no code implementations • 24 Oct 2016 • Xiaoshui Huang, Jian Zhang, Qiang Wu, Lixin Fan, Chun Yuan

In this paper, unlike previous ICP-based methods, we take a statistical view and propose an effective coarse-to-fine algorithm to detect and register a small-scale SfM point cloud within a large-scale LiDAR point cloud.

A Systematic Approach for Cross-source Point Cloud Registration by Preserving Macro and Micro Structures

no code implementations • 18 Aug 2016 • Xiaoshui Huang, Jian Zhang, Lixin Fan, Qiang Wu, Chun Yuan

We propose a systematic approach for registering cross-source point clouds.

graph construction Graph Matching +1
