Search Results for author: Yu-Xiao Guo

Found 6 papers, 4 papers with code

MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

no code implementations 22 Feb 2024 Xin-Yang Zheng, Hao Pan, Yu-Xiao Guo, Xin Tong, Yang Liu

By finetuning pretrained large image diffusion models with 3D data, the MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction.

3D Reconstruction

Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding

1 code implementation 22 Feb 2024 Yu-Qi Yang, Yu-Xiao Guo, Yang Liu

Data diversity and abundance are essential for improving the performance and generalization of models in natural language processing and 2D vision.

Scene Understanding

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding

2 code implementations 14 Apr 2023 Yu-Qi Yang, Yu-Xiao Guo, Jian-Yu Xiong, Yang Liu, Hao Pan, Peng-Shuai Wang, Xin Tong, Baining Guo

We pretrained a large Swin3D model on a synthetic Structured3D dataset, which is an order of magnitude larger than the ScanNet dataset.

Ranked #2 on 3D Object Detection on S3DIS (using extra training data)

3D Object Detection, Scene Understanding

Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images

1 code implementation ICCV 2021 Ming-Jia Yang, Yu-Xiao Guo, Bin Zhou, Xin Tong

Different from existing methods that represent an indoor scene with the type, location, and other properties of objects in the room and learn the scene layout from a collection of complete 3D indoor scenes, our method models each indoor scene as a 3D semantic scene volume and learns a volumetric generative adversarial network (GAN) from a collection of 2.5D partial observations of 3D scenes.

Generative Adversarial Network, Scene Generation

View-volume Network for Semantic Scene Completion from a Single Depth Image

no code implementations 14 Jun 2018 Yu-Xiao Guo, Xin Tong

We introduce a View-Volume convolutional neural network (VVNet) for inferring the occupancy and semantic labels of a volumetric 3D scene from a single depth image.

3D Semantic Scene Completion
