Search Results for author: Yu-Xiao Guo

Found 7 papers, 4 papers with code

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

no code implementations • 16 Apr 2024 • Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, Baining Guo

We introduce VASA, a framework for generating lifelike talking faces with appealing visual affective skills (VAS) given a single static image and a speech audio clip.

MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

no code implementations • 22 Feb 2024 • Xin-Yang Zheng, Hao Pan, Yu-Xiao Guo, Xin Tong, Yang Liu

By finetuning pretrained large image diffusion models with 3D data, the MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction.

3D Generation, 3D Reconstruction

Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding

1 code implementation • 22 Feb 2024 • Yu-Qi Yang, Yu-Xiao Guo, Yang Liu

Data diversity and abundance are essential for improving the performance and generalization of models in natural language processing and 2D vision.

Scene Understanding

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding

2 code implementations • 14 Apr 2023 • Yu-Qi Yang, Yu-Xiao Guo, Jian-Yu Xiong, Yang Liu, Hao Pan, Peng-Shuai Wang, Xin Tong, Baining Guo

We pretrained a large Swin3D model on a synthetic Structured3D dataset, which is an order of magnitude larger than the ScanNet dataset.

Ranked #2 on 3D Object Detection on S3DIS (using extra training data)

3D Object Detection Scene Understanding +1

Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images

1 code implementation • ICCV 2021 • Ming-Jia Yang, Yu-Xiao Guo, Bin Zhou, Xin Tong

Different from existing methods that represent an indoor scene with the type, location, and other properties of objects in the room and learn the scene layout from a collection of complete 3D indoor scenes, our method models each indoor scene as a 3D semantic scene volume and learns a volumetric generative adversarial network (GAN) from a collection of 2.5D partial observations of 3D scenes.

Generative Adversarial Network, Scene Generation

View-volume Network for Semantic Scene Completion from a Single Depth Image

no code implementations • 14 Jun 2018 • Yu-Xiao Guo, Xin Tong

We introduce a View-Volume convolutional neural network (VVNet) for inferring the occupancy and semantic labels of a volumetric 3D scene from a single depth image.

Ranked #20 on 3D Semantic Scene Completion on NYUv2

3D Semantic Scene Completion

O-CNN: Octree-based Convolutional Neural Networks for 3D Shape Analysis

1 code implementation • 5 Dec 2017 • Peng-Shuai Wang, Yang Liu, Yu-Xiao Guo, Chun-Yu Sun, Xin Tong

We present O-CNN, an Octree-based Convolutional Neural Network (CNN) for 3D shape analysis.

Ranked #4 on 3D Object Classification on ModelNet40

3D Object Classification Retrieval +1
