Search Results for author: Xin Tong

Found 69 papers, 34 papers with code

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding

no code implementations 14 Apr 2023 Yu-Qi Yang, Yu-Xiao Guo, Jian-Yu Xiong, Yang Liu, Hao Pan, Peng-Shuai Wang, Xin Tong, Baining Guo

Based on this backbone design, we pretrained a large Swin3D model on a synthetic Structured3D dataset that is 10 times larger than the ScanNet dataset and fine-tuned the pretrained model in various downstream real-world indoor scene understanding tasks.

Ranked #1 on Semantic Segmentation on ScanNet (using extra training data)

3D Object Detection Scene Understanding

3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining

no code implementations 14 Apr 2023 Siming Yan, YuQi Yang, YuXiao Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu, QiXing Huang

Masked autoencoders (MAE) have recently been introduced to 3D self-supervised pretraining for point clouds due to their great success in NLP and computer vision.

3D-aware Image Generation using 2D Diffusion Models

no code implementations 31 Mar 2023 Jianfeng Xiang, Jiaolong Yang, Binbin Huang, Xin Tong

In this paper, we introduce a novel 3D-aware image generation method that leverages 2D diffusion models.

Image Generation

Synergistic Potential Functions from Single Modified Trace Function on SO(3)

no code implementations 28 Mar 2023 Xin Tong, Shing Shin Cheng

Second, it can be shown that for each potential function in the family, there exists a subset of the family such that the synergistic gap is positive at the unwanted critical points.

ReBotNet: Fast Real-time Video Enhancement

no code implementations 23 Mar 2023 Jeya Maria Jose Valanarasu, Rahul Garg, Andeep Toor, Xin Tong, Weijuan Xi, Andreas Lugmayr, Vishal M. Patel, Anne Menini

The first branch learns spatio-temporal features by tokenizing the input frames along the spatial and temporal dimensions using a ConvNext-based encoder and processing these abstract tokens using a bottleneck mixer.

Video Enhancement Video Restoration

RemoteTouch: Enhancing Immersive 3D Video Communication with Hand Touch

no code implementations 28 Feb 2023 Yizhong Zhang, Zhiqi Li, Sicheng Xu, Chong Li, Jiaolong Yang, Xin Tong, Baining Guo

A key challenge in emulating the remote hand touch is the realistic rendering of the participant's hand and arm as the hand touches the screen.

NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation

no code implementations CVPR 2023 Yu Yin, Kamran Ghasedi, HsiangTao Wu, Jiaolong Yang, Xin Tong, Yun Fu

Furthermore, our method leverages explicit and implicit 3D regularizations using the in-domain neighborhood samples around the optimized latent code to remove geometrical and visual artifacts.

Image Animation

AniFaceGAN: Animatable 3D-Aware Face Image Generation for Video Avatars

no code implementations 12 Oct 2022 Yue Wu, Yu Deng, Jiaolong Yang, Fangyun Wei, Qifeng Chen, Xin Tong

To achieve meaningful control over facial expressions via deformation, we propose a 3D-level imitative learning scheme between the generator and a parametric 3D face model during adversarial training of the 3D-aware GAN.

Disentanglement Face Model +1

Hierarchical Neyman-Pearson Classification for Prioritizing Severe Disease Categories in COVID-19 Patient Data

no code implementations 1 Oct 2022 Lijia Wang, Y. X. Rachel Wang, Jingyi Jessica Li, Xin Tong

Here, we propose a hierarchical NP (H-NP) framework and an umbrella algorithm that generally adapts to popular classification methods and controls the under-diagnosis errors with high probability.

Binary Classification Classification +1

Generative Deformable Radiance Fields for Disentangled Image Synthesis of Topology-Varying Objects

no code implementations 9 Sep 2022 Ziyu Wang, Yu Deng, Jiaolong Yang, Jingyi Yu, Xin Tong

Experiments show that our method can successfully learn the generative model from unstructured monocular images and well disentangle the shape and appearance for objects (e.g., chairs) with large topological variance.

Disentanglement Image Generation

Semantic Segmentation-Assisted Instance Feature Fusion for Multi-Level 3D Part Instance Segmentation

1 code implementation 9 Aug 2022 ChunYu Sun, Xin Tong, Yang Liu

Our method exploits semantic segmentation to fuse nonlocal instance features, such as center prediction, and further enhances the fusion scheme in a multi- and cross-level way.

3D Instance Segmentation 3D Part Segmentation +1

Deep Deformable 3D Caricatures with Learned Shape Control

1 code implementation 29 Jul 2022 Yucheol Jung, Wonjong Jang, Soongjin Kim, Jiaolong Yang, Xin Tong, Seungyong Lee

To achieve the goal, we propose an MLP-based framework for building a deformable surface model, which takes a latent code and produces a 3D surface.


Sparse Ellipsometry: Portable Acquisition of Polarimetric SVBRDF and Shape with Unstructured Flash Photography

1 code implementation 9 Jul 2022 Inseung Hwang, Daniel S. Jeon, Adolfo Muñoz, Diego Gutierrez, Xin Tong, Min H. Kim

Ellipsometry techniques allow to measure polarization information of materials, requiring precise rotations of optical components with different configurations of lights and sensors.

Data Augmentation Inverse Rendering

Environment Sensing Considering the Occlusion Effect: A Multi-View Approach

no code implementations 2 Jul 2022 Xin Tong, Zhaoyang Zhang, Yihan Zhang, Zhaohui Yang, Chongwen Huang, Kai-Kit Wong, Merouane Debbah

In this paper, we consider the problem of sensing the environment within a wireless cellular framework.

SDF-StyleGAN: Implicit SDF-Based StyleGAN for 3D Shape Generation

1 code implementation 24 Jun 2022 Xin-Yang Zheng, Yang Liu, Peng-Shuai Wang, Xin Tong

We further complement the evaluation metrics of 3D generative models with the shading-image-based Fréchet inception distance (FID) scores to better assess visual quality and shape distribution of the generated shapes.

3D Shape Generation 3D Shape Representation

ComplexGen: CAD Reconstruction by B-Rep Chain Complex Generation

1 code implementation 29 May 2022 Haoxiang Guo, Shilin Liu, Hao Pan, Yang Liu, Xin Tong, Baining Guo

We view the reconstruction of CAD models in the boundary representation (B-Rep) as the detection of geometric primitives of different orders, i.e., vertices, edges and surface patches, and the correspondence of primitives, which are holistically modeled as a chain complex, and show that by modeling such comprehensive structures more complete and regularized reconstructions can be achieved.

Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations

1 code implementation 5 May 2022 Peng-Shuai Wang, Yang Liu, Xin Tong

Our method encodes the volumetric field of a 3D shape with an adaptive feature volume organized by an octree and applies a compact multilayer perceptron network for mapping the features to the field value at each 3D position.

3D Shape Reconstruction

Semi-supervised 3D shape segmentation with multilevel consistency and part substitution

1 code implementation 19 Apr 2022 Chun-Yu Sun, Yu-Qi Yang, Hao-Xiang Guo, Peng-Shuai Wang, Xin Tong, Yang Liu, Heung-Yeung Shum

We propose an effective semi-supervised method for learning 3D segmentations from a few labeled 3D shapes and a large amount of unlabeled 3D data.

Semantic Segmentation Semantic Segmentation on ScanNet +1

MPS-NeRF: Generalizable 3D Human Rendering from Multiview Images

no code implementations 31 Mar 2022 Xiangjun Gao, Jiaolong Yang, Jongyoo Kim, Sida Peng, Zicheng Liu, Xin Tong

For this task, we propose a simple yet effective method to train a generalizable NeRF with multiview images as conditional input.

Novel View Synthesis

Transformer Based Line Segment Classifier With Image Context for Real-Time Vanishing Point Detection in Manhattan World

no code implementations CVPR 2022 Xin Tong, Xianghua Ying, Yongjie Shi, Ruibin Wang, Jinfa Yang

To achieve this goal, we propose a novel Transformer based Line segment Classifier (TLC) that can group line segments in images and estimate the corresponding vanishing points.

VirtualCube: An Immersive 3D Video Communication System

no code implementations 13 Dec 2021 Yizhong Zhang, Jiaolong Yang, Zhen Liu, Ruicheng Wang, Guojun Chen, Xin Tong, Baining Guo

The VirtualCube system is a 3D video conference system that attempts to overcome some limitations of conventional technologies.

Depth Estimation

Sampling with Trusthworthy Constraints: A Variational Gradient Framework

1 code implementation NeurIPS 2021 Xingchao Liu, Xin Tong, Qiang Liu

In this work, we propose a family of constrained sampling algorithms which generalize Langevin Dynamics (LD) and Stein Variational Gradient Descent (SVGD) to incorporate a moment constraint specified by a general nonlinear function.

Bayesian Inference Fairness

Asymmetric error control under imperfect supervision: a label-noise-adjusted Neyman-Pearson umbrella algorithm

no code implementations 1 Dec 2021 Shunan Yao, Bradley Rava, Xin Tong, Gareth James

It is somewhat surprising that even when common NP classifiers ignore the label noise in the training stage, they are still able to control the type I error with high probability.

Classification Medical Diagnosis +1

Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent

1 code implementation NeurIPS 2021 Xingchao Liu, Xin Tong, Qiang Liu

Finding diverse and representative Pareto solutions from the Pareto front is a key challenge in multi-objective optimization (MOO).

Joint Multi-User Communication and Sensing Exploiting Both Signal and Environment Sparsity

no code implementations 6 Sep 2021 Xin Tong, Zhaoyang Zhang, Jue Wang, Chongwen Huang, Merouane Debbah

As a potential technology feature for 6G wireless networks, the idea of sensing-communication integration requires the system not only to complete reliable multi-user communication but also to achieve accurate environment sensing.

object-detection Object Detection

Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images

1 code implementation ICCV 2021 Ming-Jia Yang, Yu-Xiao Guo, Bin Zhou, Xin Tong

Different from existing methods that represent an indoor scene with the type, location, and other properties of objects in the room and learn the scene layout from a collection of complete 3D indoor scenes, our method models each indoor scene as a 3D semantic scene volume and learns a volumetric generative adversarial network (GAN) from a collection of 2.5D partial observations of 3D scenes.

Scene Generation

StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation

1 code implementation 9 Jul 2021 Wonjong Jang, Gwangjin Ju, Yucheol Jung, Jiaolong Yang, Xin Tong, Seungyong Lee

Our framework, dubbed StyleCariGAN, automatically creates a realistic and detailed caricature from an input photo with optional controls on shape exaggeration degree and color stylization type.


Spline Positional Encoding for Learning 3D Implicit Signed Distance Fields

1 code implementation 3 Jun 2021 Peng-Shuai Wang, Yang Liu, Yu-Qi Yang, Xin Tong

Multilayer perceptrons (MLPs) have been successfully used to represent 3D shapes implicitly and compactly, by mapping 3D coordinates to the corresponding signed distance values or occupancy values.

3D Shape Reconstruction Image Reconstruction

High-Resolution Optical Flow from 1D Attention and Correlation

1 code implementation ICCV 2021 Haofei Xu, Jiaolong Yang, Jianfei Cai, Juyong Zhang, Xin Tong

Optical flow is inherently a 2D search problem, and thus the computational complexity grows quadratically with respect to the search window, making large displacements matching infeasible for high-resolution images.

Optical Flow Estimation Vocal Bursts Intensity Prediction

Group-Free 3D Object Detection via Transformers

3 code implementations ICCV 2021 Ze Liu, Zheng Zhang, Yue Cao, Han Hu, Xin Tong

Instead of grouping local points to each object candidate, our method computes the feature of an object from all the points in the point cloud with the help of an attention mechanism in the Transformers (Vaswani et al., 2017), where the contribution of each point is automatically learned in the network training.

3D Object Detection object-detection

Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

1 code implementation CVPR 2021 Shi-Lin Liu, Hao-Xiang Guo, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu

We incorporate IMLS surface generation into deep neural networks for inheriting both the flexibility of point sets and the high quality of implicit surfaces.

3D Object Reconstruction 3D Reconstruction

Learning High-Fidelity Face Texture Completion Without Complete Face Texture

no code implementations ICCV 2021 Jongyoo Kim, Jiaolong Yang, Xin Tong

For face texture completion, previous methods typically use some complete textures captured by multiview imaging systems or 3D scanners for supervised learning.

Vocal Bursts Intensity Prediction

Bridging Cost-sensitive and Neyman-Pearson Paradigms for Asymmetric Binary Classification

1 code implementation 29 Dec 2020 Wei Vivian Li, Xin Tong, Jingyi Jessica Li

In contrast, the Neyman-Pearson paradigm can train classifiers to achieve a high-probability control of the population type I error, but it relies on sample splitting that reduces the effective training sample size.

Binary Classification General Classification +1

Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence

1 code implementation CVPR 2021 Yu Deng, Jiaolong Yang, Xin Tong

We propose a novel Deformed Implicit Field (DIF) representation for modeling 3D shapes of a category and generating dense correspondences among shapes.

Field-Tuned Quantum Effects in a Triangular-Lattice Ising Magnet

no code implementations 18 Nov 2020 Yayuan Qin, Yao Shen, ChangLe Liu, Hongliang Wo, Yonghao Gao, Yu Feng, Xiaowen Zhang, Gaofeng Ding, Yiqing Gu, Qisi Wang, Shoudong Shen, Helen C. Walker, Robert Bewley, Jianhui Xu, Martin Boehm, Paul Steffens, Seiko Ohira-Kawamura, Naoki Murai, Astrid Schneidewind, Xin Tong, Gang Chen, Jun Zhao

We report thermodynamic and neutron scattering measurements of the triangular-lattice quantum Ising magnet TmMgGaO4 in longitudinal magnetic fields.

Strongly Correlated Electrons Materials Science

SkeletonNet: A Topology-Preserving Solution for Learning Mesh Reconstruction of Object Surfaces from RGB Images

1 code implementation 13 Aug 2020 Jiapeng Tang, Xiaoguang Han, Mingkui Tan, Xin Tong, Kui Jia

However, they all have their own drawbacks, and cannot properly reconstruct the surface shapes of complex topologies, arguably due to a lack of constraints on the topological structures in their learning frameworks.

Surface Reconstruction

Object-based Illumination Estimation with Rendering-aware Neural Networks

no code implementations ECCV 2020 Xin Wei, Guojun Chen, Yue Dong, Stephen Lin, Xin Tong

With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent to the real scene, leading to improved realism.

Inverse Rendering

Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination

1 code implementation 3 Aug 2020 Peng-Shuai Wang, Yu-Qi Yang, Qian-Fang Zou, Zhirong Wu, Yang Liu, Xin Tong

Although unsupervised feature learning has demonstrated its advantages to reducing the workload of data labeling and network design in many fields, existing unsupervised 3D learning methods still cannot offer a generic network for various shape analysis tasks with competitive performance to supervised methods.

3D Point Cloud Linear Classification 3D Semantic Segmentation

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

1 code implementation ECCV 2020 Ze Liu, Han Hu, Yue Cao, Zheng Zhang, Xin Tong

Our investigation reveals that despite the different designs of these operators, all of these operators make surprisingly similar contributions to the network performance under the same network input and feature numbers and result in the state-of-the-art accuracy on standard benchmarks.

3D Semantic Segmentation

Deep Octree-based CNNs with Output-Guided Skip Connections for 3D Shape and Scene Completion

1 code implementation 6 Jun 2020 Peng-Shuai Wang, Yang Liu, Xin Tong

Acquiring complete and clean 3D shape and scene data is challenging due to geometric occlusion and insufficient views during 3D capturing.

Deep 3D Portrait from a Single Image

1 code implementation CVPR 2020 Sicheng Xu, Jiaolong Yang, Dong Chen, Fang Wen, Yu Deng, Yunde Jia, Xin Tong

We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images.

Face Model Stereo Matching

Imbalanced classification: a paradigm-based review

no code implementations 11 Feb 2020 Yang Feng, Min Zhou, Xin Tong

For each pair of resampling techniques and classification methods, we use simulation studies and a real data set on credit card fraud to study the performance under different evaluation metrics.

Binary Classification Classification +2

Synthesizing 3D Shapes from Silhouette Image Collections using Multi-projection Generative Adversarial Networks

no code implementations CVPR 2019 Xiao Li, Yue Dong, Pieter Peers, Xin Tong

Key to our method is a novel multi-projection generative adversarial network (MP-GAN) that trains a 3D shape generator to be consistent with multiple 2D projections of the 3D shapes, and without direct access to these 3D shapes.

Weakly-supervised Learning

Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set

3 code implementations 20 Mar 2019 Yu Deng, Jiaolong Yang, Sicheng Xu, Dong Chen, Yunde Jia, Xin Tong

Recently, deep learning based 3D face reconstruction methods have shown promising results in both quality and efficiency. However, training deep neural networks typically requires a large volume of data, whereas face images with ground-truth 3D face shapes are scarce.

3D Face Reconstruction Weakly-supervised Learning

Learn a Prior for RHEA for Better Online Planning

no code implementations 14 Feb 2019 Xin Tong, Weiming Liu, Bin Li

In this paper, we propose to learn a prior for RHEA in an offline manner by training a value network and a policy network.

OpenAI Gym

Image Smoothing via Unsupervised Learning

1 code implementation 7 Nov 2018 Qingnan Fan, Jiaolong Yang, David Wipf, Baoquan Chen, Xin Tong

Image smoothing represents a fundamental component of many disparate computer vision and graphics applications.

Image Manipulation image smoothing

Adaptive O-CNN: A Patch-based Deep Representation of 3D Shapes

1 code implementation 21 Sep 2018 Peng-Shuai Wang, Chun-Yu Sun, Yang Liu, Xin Tong

The Adaptive O-CNN encoder takes the planar patch normal and displacement as input and performs 3D convolutions only at the octants at each level, while the Adaptive O-CNN decoder infers the shape occupancy and subdivision status of octants at each level and estimates the best plane normal and displacement for each leaf octant.

Deep Single-View 3D Object Reconstruction with Visual Hull Embedding

1 code implementation 10 Sep 2018 Hanqing Wang, Jiaolong Yang, Wei Liang, Xin Tong

The key idea of our method is to leverage object mask and pose estimation from CNNs to assist the 3D shape learning by constructing a probabilistic single-view visual hull inside of the network.

3D Object Reconstruction Pose Estimation

PFCNN: Convolutional Neural Networks on 3D Surfaces Using Parallel Frames

1 code implementation CVPR 2020 Yu-Qi Yang, Shilin Liu, Hao Pan, Yang Liu, Xin Tong

Surface meshes are widely used shape representations and capture finer geometry data than point clouds or volumetric grids, but are challenging to apply CNNs directly due to their non-Euclidean structure.

Ranked #21 on Semantic Segmentation on ScanNet (test mIoU metric)

Scene Segmentation

View-volume Network for Semantic Scene Completion from a Single Depth Image

no code implementations 14 Jun 2018 Yu-Xiao Guo, Xin Tong

We introduce a View-Volume convolutional neural network (VVNet) for inferring the occupancy and semantic labels of a volumetric 3D scene from a single depth image.

3D Semantic Scene Completion

Neyman-Pearson classification: parametrics and sample size requirement

no code implementations 7 Feb 2018 Xin Tong, Lucy Xia, Jiacheng Wang, Yang Feng

In this work, we employ the parametric linear discriminant analysis (LDA) model and propose a new parametric thresholding algorithm, which does not need the minimum sample size requirements on class $0$ observations and thus is suitable for small sample applications such as rare disease diagnosis.

Binary Classification Classification +3

Intentional Control of Type I Error over Unconscious Data Distortion: a Neyman-Pearson Approach to Text Classification

no code implementations 7 Feb 2018 Lucy Xia, Richard Zhao, Yanhui Wu, Xin Tong

To deal with inestimable data distortion, we propose the use of the Neyman-Pearson (NP) classification paradigm, which minimizes type II error under a user-specified type I error constraint.

General Classification text-classification +1

Demography-based Facial Retouching Detection using Subclass Supervised Sparse Autoencoder

no code implementations 22 Sep 2017 Aparna Bharati, Mayank Vatsa, Richa Singh, Kevin W. Bowyer, Xin Tong

However, previous work on this topic has not considered whether or how accuracy of retouching detection varies with the demography of face images.

Mesh Denoising via Cascaded Normal Regression

no code implementations 15 Nov 2016 Peng-Shuai Wang, Yang Liu, Xin Tong

At runtime, our method applies the learned cascaded regression functions to a noisy input mesh and reconstructs the denoised mesh from the output facet normals.

Denoising regression

Neyman-Pearson Classification under High-Dimensional Settings

no code implementations 13 Aug 2015 Anqi Zhao, Yang Feng, Lie Wang, Xin Tong

Most existing binary classification methods target on the optimization of the overall classification risk and may fail to serve some real-world applications such as cancer diagnosis, where users are more concerned with the risk of misclassifying one specific class than the other.

Binary Classification Classification +4

Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification

no code implementations 31 Dec 2013 Jianqing Fan, Yang Feng, Jiancheng Jiang, Xin Tong

We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities.

Additive models General Classification +1
