Search Results for author: Xiaoguang Han

Found 95 papers, 29 papers with code

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

no code implementations • 8 Apr 2024 • Heyuan Li, Ce Chen, Tianhao Shi, Yuda Qiu, Sizhe An, GuanYing Chen, Xiaoguang Han

We further introduce a view-image consistency loss for the discriminator to emphasize the correspondence of the camera parameters and the images.

Face Generation


GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond

1 code implementation • 28 Mar 2024 • Chongjie Ye, Yinyu Nie, Jiahao Chang, Yuantao Chen, YiHao Zhi, Xiaoguang Han

We present GauStudio, a novel modular framework for modeling 3D Gaussian Splatting (3DGS) to provide standardized, plug-and-play components for users to easily customize and implement a 3DGS pipeline.

Novel View Synthesis Surface Reconstruction

651


LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset

no code implementations • 19 Dec 2023 • Haolin Liu, Chongjie Ye, Yinyu Nie, Yingfan He, Xiaoguang Han

Instance shape reconstruction from a 3D scene involves recovering the full geometries of multiple objects at the semantic instance level.

3D Object Detection Object +1


PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

no code implementations • 7 Dec 2023 • Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han

Unlike prior arts constrained by specific input types, our method allows flexible specification of style (text or image) and texture (full garment, cropped sections, or texture patches) conditions.

Disentanglement Human Parsing +1


MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures

no code implementations • 5 Dec 2023 • Zhangyang Xiong, Chenghong Li, Kenkun Liu, Hongjie Liao, Jianqiao Hu, Junyi Zhu, Shuliang Ning, Lingteng Qiu, Chongjie Wang, Shijie Wang, Shuguang Cui, Xiaoguang Han

In this era, the success of large language models and text-to-image models can be attributed to the driving force of large-scale datasets.

Action Recognition Image Generation


SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation

1 code implementation • 29 Nov 2023 • Mutian Xu, Xingyilang Yin, Lingteng Qiu, Yang Liu, Xin Tong, Xiaoguang Han

We introduce SAMPro3D for zero-shot 3D indoor scene segmentation.

Scene Segmentation Scene Understanding +1


RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D

no code implementations • 28 Nov 2023 • Lingteng Qiu, GuanYing Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han

Lifting 2D diffusion for 3D generation is a challenging problem due to the lack of geometric prior and the complex entanglement of materials and lighting in natural images.

3D Generation Text to 3D


HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images

no code implementations • 27 Nov 2023 • Xihe Yang, Xingyu Chen, Daiheng Gao, Shaohui Wang, Xiaoguang Han, Baoyuan Wang

As for human avatar reconstruction, contemporary techniques commonly necessitate the acquisition of costly data and struggle to achieve satisfactory results from a small number of casual images.


FIRST: A Million-Entry Dataset for Text-Driven Fashion Synthesis and Design

no code implementations • 13 Nov 2023 • Zhen Huang, Yihao Li, Dong Pei, Jiapeng Zhou, Xuliang Ning, Jianlin Han, Xiaoguang Han, Xuejun Chen

Text-driven fashion synthesis and design is an extremely valuable part of artificial intelligence generative content (AIGC), which has the potential to propel a tremendous revolution in the traditional fashion industry.

Fashion Synthesis


SeamlessNeRF: Stitching Part NeRFs with Gradient Propagation

no code implementations • 30 Oct 2023 • Bingchen Gong, Yuehao Wang, Xiaoguang Han, Qi Dou

To fill this gap, we propose SeamlessNeRF, a novel approach for seamless appearance blending of multiple NeRFs.


Activate and Reject: Towards Safe Domain Generalization under Category Shift

no code implementations • ICCV 2023 • Chaoqi Chen, Luyao Tang, Leitian Tao, Hong-Yu Zhou, Yue Huang, Xiaoguang Han, Yizhou Yu

Despite notable performance on in-domain test points, it is non-trivial for deep neural networks to attain satisfactory accuracy when deployed in the open world, where novel domains and object classes often occur.

Domain Generalization Image Classification +3


EMS: 3D Eyebrow Modeling from Single-view Images

no code implementations • 22 Sep 2023 • Chenghong Li, Leyang Jin, Yujian Zheng, Yizhou Yu, Xiaoguang Han

Three modules are then carefully designed: RootFinder first localizes the fiber root positions, which indicate where to grow; OriPredictor predicts an orientation field in 3D space to guide the growth of fibers; FiberEnder determines when to stop the growth of each fiber.


Efficient View Synthesis with Neural Radiance Distribution Field

no code implementations • ICCV 2023 • Yushuang Wu, Xiao Li, Jinglu Wang, Xiaoguang Han, Shuguang Cui, Yan Lu

Specifically, we use a small network similar to NeRF while preserving the rendering speed with a single network forwarding per pixel as in NeLF.


Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks

no code implementations • 13 Aug 2023 • David Junhao Zhang, Mutian Xu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou

Despite the rapid advancement of unsupervised learning in visual representation, it requires training on large-scale datasets that demand costly data collection, and pose additional challenges due to concerns regarding data privacy.

Contrastive Learning Image Classification +2


Universal Semi-supervised Model Adaptation via Collaborative Consistency Training

no code implementations • 7 Jul 2023 • Zizheng Yan, Yushuang Wu, Yipeng Qin, Xiaoguang Han, Shuguang Cui, Guanbin Li

In this paper, we introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA), which i) requires only a pre-trained source model, ii) allows the source and target domain to have different label sets, i.e., they share a common label set and hold their own private label set, and iii) requires only a few labeled samples in each class of the target domain.

Domain Adaptation


SketchMetaFace: A Learning-based Sketching Interface for High-fidelity 3D Character Face Modeling

no code implementations • 3 Jul 2023 • Zhongjin Luo, Dong Du, Heming Zhu, Yizhou Yu, Hongbo Fu, Xiaoguang Han

User studies demonstrate the superiority of our system over existing modeling tools in terms of ease of use and visual quality of results.


3D Keypoint Estimation Using Implicit Representation Learning

no code implementations • 20 Jun 2023 • Xiangyu Zhu, Dong Du, Haibin Huang, Chongyang Ma, Xiaoguang Han

Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.

Keypoint Estimation Representation Learning


AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets

no code implementations • 16 Jun 2023 • Yu Lu, Junwei Bao, Zichen Ma, Xiaoguang Han, Youzheng Wu, Shuguang Cui, Xiaodong He

High-quality data is essential for conversational recommendation systems and serves as the cornerstone of the network architecture development and training strategy design.

Knowledge Graphs Recommendation Systems


From NeRFLiX to NeRFLiX++: A General NeRF-Agnostic Restorer Paradigm

1 code implementation • 10 Jun 2023 • Kun Zhou, Wenbo Li, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu

To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer.

Ranked #1 on Novel View Synthesis on Tanks and Temples

Computational Efficiency Novel View Synthesis


REC-MV: REconstructing 3D Dynamic Cloth from Monocular Videos

1 code implementation • CVPR 2023 • Lingteng Qiu, GuanYing Chen, Jiapeng Zhou, Mutian Xu, Junle Wang, Xiaoguang Han

To address the above limitations, in this paper, we formulate this task as an optimization problem of 3D garment feature curves and surface reconstruction from monocular video.

Garment Reconstruction Neural Rendering +1

273


FashionTex: Controllable Virtual Try-on with Text and Texture

1 code implementation • 8 May 2023 • Anran Lin, Nanxuan Zhao, Shuliang Ning, Yuda Qiu, Baoyuan Wang, Xiaoguang Han

Virtual try-on attracts increasing research attention as a promising way for enhancing the user experience for online cloth shopping.

Virtual Try-on


SCoDA: Domain Adaptive Shape Completion for Real Scans

1 code implementation • CVPR 2023 • Yushuang Wu, Zizheng Yan, Ce Chen, Lai Wei, Xiao Li, Guanbin Li, Yihao Li, Shuguang Cui, Xiaoguang Han

Thus, we propose a new task, SCoDA, for the domain adaptation of real scan shape completion from synthetic data.

Benchmarking Domain Adaptation +1

182


NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud

no code implementations • CVPR 2023 • Xiangyu Zhu, Dong Du, Weikai Chen, Zhiyou Zhao, Yinyu Nie, Xiaoguang Han

We show that a simple network based on NerVE can already outperform the previous state-of-the-art methods by a great margin.

Keypoint Detection


RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset

no code implementations • CVPR 2023 • Zhongjin Luo, Shengcai Cai, Jinguo Dong, Ruibo Ming, Liangdong Qiu, Xiaohang Zhan, Xiaoguang Han

However, none of the prior works focus on modeling 3D biped cartoon characters, which are also in great demand in gaming and filming.


NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer

1 code implementation • CVPR 2023 • Kun Zhou, Wenbo Li, Yi Wang, Tao Hu, Nianjuan Jiang, Xiaoguang Han, Jiangbo Lu

Neural radiance fields (NeRF) show great success in novel view synthesis.

Ranked #1 on Novel View Synthesis on LLFF

Neural Rendering Novel View Synthesis


MVImgNet: A Large-scale Dataset of Multi-view Images

no code implementations • CVPR 2023 • Xianggang Yu, Mutian Xu, Yidan Zhang, Haolin Liu, Chongjie Ye, Yushuang Wu, Zizheng Yan, Chenming Zhu, Zhangyang Xiong, Tianyou Liang, GuanYing Chen, Shuguang Cui, Xiaoguang Han

The birth of ImageNet drives a remarkable trend of "learning from large-scale data" in computer vision.

3D Object Classification Attribute


HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling

1 code implementation • CVPR 2023 • Yujian Zheng, Zirong Jin, Moran Li, Haibin Huang, Chongyang Ma, Shuguang Cui, Xiaoguang Han

We firmly think an intermediate representation is essential, but we argue that the orientation map produced by the dominant filtering-based methods is sensitive to uncertain noise and far from a competent representation.

154


Get3DHuman: Lifting StyleGAN-Human into a 3D Generative Model using Pixel-aligned Reconstruction Priors

no code implementations • ICCV 2023 • Zhangyang Xiong, Di Kang, Derong Jin, Weikai Chen, Linchao Bao, Shuguang Cui, Xiaoguang Han

Specifically, we bridge the latent space of Get3DHuman with that of StyleGAN-Human via a specially-designed prior network, where the input latent code is mapped to the shape and texture feature volumes spanned by the pixel-aligned 3D reconstructor.


RecolorNeRF: Layer Decomposed Radiance Fields for Efficient Color Editing of 3D Scenes

no code implementations • 19 Jan 2023 • Bingchen Gong, Yuehao Wang, Xiaoguang Han, Qi Dou

We present RecolorNeRF, a novel user-friendly color editing approach for the neural radiance fields.

Color Manipulation


Which Pixel to Annotate: a Label-Efficient Nuclei Segmentation Framework

1 code implementation • 20 Dec 2022 • Wei Lou, Haofeng Li, Guanbin Li, Xiaoguang Han, Xiang Wan

Recently, deep neural networks, which require a large amount of annotated samples, have been widely applied in nuclei instance segmentation of H&E stained pathology images.

Instance Segmentation Segmentation +1


MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling with Informative-Preserved Reconstruction and Self-Distilled Consistency

no code implementations • CVPR 2023 • Mingye Xu, Mutian Xu, Tong He, Wanli Ouyang, Yali Wang, Xiaoguang Han, Yu Qiao

Besides, such scenes with progressive masking ratios can also serve to self-distill their intrinsic spatial consistency, which requires learning consistent representations from unmasked areas.

object-detection Object Detection +2


MIMO Is All You Need: A Strong Multi-In-Multi-Out Baseline for Video Prediction

1 code implementation • 9 Dec 2022 • Shuliang Ning, Mengcheng Lan, Yanran Li, Chaofeng Chen, Qian Chen, Xunlai Chen, Xiaoguang Han, Shuguang Cui

The mainstream of the existing approaches for video prediction builds up their models based on a Single-In-Single-Out (SISO) architecture, which takes the current frame as input to predict the next frame in a recursive manner.

Video Prediction


Learning 3D Scene Priors with 2D Supervision

no code implementations • CVPR 2023 • Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner

Holistic 3D scene understanding entails estimation of both layout configuration and object geometry in a 3D environment.

Scene Understanding


Mutual Guidance and Residual Integration for Image Enhancement

no code implementations • 25 Nov 2022 • Kun Zhou, Kenkun Liu, Wenbo Li, Xiaoguang Han, Jiangbo Lu

To address those issues, we propose a novel mutual guidance network (MGN) to perform effective bidirectional global-local information exchange while keeping a compact architecture.

Computational Efficiency Image Enhancement +1


Point Cloud Scene Completion with Joint Color and Semantic Estimation from Single RGB-D Image

no code implementations • 12 Oct 2022 • Zhaoxuan Zhang, Xiaoguang Han, Bo Dong, Tong Li, BaoCai Yin, Xin Yang

Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide to the next view inpainting step, which attempts to make up the missing information; the third step involves projecting the volume under the same view of the input, concatenating them to complete the current view RGB-D and segmentation map, and integrating all RGB-D and segmentation maps into the point cloud.

Image Inpainting Segmentation +1


A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective

no code implementations • 27 Sep 2022 • Chaoqi Chen, Yushuang Wu, Qiyuan Dai, Hong-Yu Zhou, Mutian Xu, Sibei Yang, Xiaoguang Han, Yizhou Yu

Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (e.g., social network analysis and recommender systems), computer vision (e.g., object detection and point cloud learning), and natural language processing (e.g., relation extraction and sequence learning), to name a few.

Graph Representation Learning object-detection +3


PIFu for the Real World: A Self-supervised Framework to Reconstruct Dressed Human from Single-view Images

no code implementations • 23 Aug 2022 • Zhangyang Xiong, Dong Du, Yushuang Wu, Jingqi Dong, Di Kang, Linchao Bao, Xiaoguang Han

On synthetic data, our Intersection-Over-Union (IoU) reaches 93.5%, 18% higher than PIFuHD.

Self-Supervised Learning


Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes

1 code implementation • 18 Jul 2022 • Haolin Liu, Yujian Zheng, GuanYing Chen, Shuguang Cui, Xiaoguang Han

We present a new framework to reconstruct holistic 3D indoor scenes including both room background and indoor objects from single-view images.

Object Reconstruction Vocal Bursts Intensity Prediction


Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection

no code implementations • 6 Jun 2022 • Chaoqi Chen, Jiongcheng Li, Hong-Yu Zhou, Xiaoguang Han, Yue Huang, Xinghao Ding, Yizhou Yu

However, both the global and local alignment approaches fail to capture the topological relations among different foreground objects as the explicit dependencies and interactions between and within domains are neglected.

Domain Adaptation Graph Attention +5


Multi-level Consistency Learning for Semi-supervised Domain Adaptation

1 code implementation • 9 May 2022 • Zizheng Yan, Yushuang Wu, Guanbin Li, Yipeng Qin, Xiaoguang Han, Shuguang Cui

Semi-supervised domain adaptation (SSDA) aims to apply knowledge learned from a fully labeled source domain to a scarcely labeled target domain.

Ranked #1 on Semi-supervised Domain Adaptation on VisDA2017

Domain Adaptation Semi-supervised Domain Adaptation


DArch: Dental Arch Prior-assisted 3D Tooth Instance Segmentation

no code implementations • 25 Apr 2022 • Liangdong Qiu, Chongjie Ye, Pei Chen, Yunbi Liu, Xiaoguang Han, Shuguang Cui

Experimental results on 4,773 dental models have shown our DArch can accurately segment each tooth of a dental model, and its performance is superior to the state-of-the-art methods.

Instance Segmentation Segmentation +1


Registering Explicit to Implicit: Towards High-Fidelity Garment mesh Reconstruction from Single Images

no code implementations • CVPR 2022 • Heming Zhu, Lingteng Qiu, Yuda Qiu, Xiaoguang Han

Fueled by the power of deep learning techniques and implicit shape learning, recent advances in single-image human digitalization have reached unprecedented accuracy and could recover fine-grained surface details such as garment wrinkles.

Garment Reconstruction


Compound Domain Generalization via Meta-Knowledge Encoding

no code implementations • CVPR 2022 • Chaoqi Chen, Jiongcheng Li, Xiaoguang Han, Xiaoqing Liu, Yizhou Yu

Such holistic semantic structure, referred to as meta-knowledge here, is crucial for learning generalizable representations.

Domain Generalization Out-of-Distribution Generalization


SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation

no code implementations • CVPR 2022 • Chenming Zhu, Xuanye Zhang, Yanran Li, Liangdong Qiu, Kai Han, Xiaoguang Han

Contour-based models are efficient and generic to be incorporated with any existing segmentation methods, but they often generate over-smoothed contour and tend to fail on corner areas.

Instance Segmentation Segmentation +1


Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation

no code implementations • CVPR 2023 • Kun Zhou, Wenbo Li, Xiaoguang Han, Jiangbo Lu

Without the bells and whistles, our plug-and-play TCL is capable of improving the performance of existing VFI frameworks.

Ranked #1 on Video Frame Interpolation on Middlebury (PSNR metric)

Optical Flow Estimation Video Frame Interpolation +1


TO-Scene: A Large-scale Dataset for Understanding 3D Tabletop Scenes

1 code implementation • 17 Mar 2022 • Mutian Xu, Pei Chen, Haolin Liu, Xiaoguang Han

Experiments show that the algorithms trained on TO-Scene indeed work on the realistic test data, and our proposed tabletop-aware learning strategy greatly improves the state-of-the-art results on both 3D semantic segmentation and object detection tasks.

3D Semantic Segmentation object-detection +2


Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors

2 code implementations • 26 Feb 2022 • Chaofeng Chen, Xinyu Shi, Yipeng Qin, Xiaoming Li, Xiaoguang Han, Tao Yang, Shihui Guo

Unlike image-space methods, our FeMaSR restores HR images by matching distorted LR image features to their distortion-free HR counterparts in our pretrained HR priors, and decoding the matched features to obtain realistic HR images.

Blind Super-Resolution Generative Adversarial Network +2

187


PointMatch: A Consistency Training Framework for Weakly Supervised Semantic Segmentation of 3D Point Clouds

no code implementations • 22 Feb 2022 • Yushuang Wu, Zizheng Yan, Shengcai Cai, Guanbin Li, Yizhou Yu, Xiaoguang Han, Shuguang Cui

Semantic segmentation of point cloud usually relies on dense annotation that is exhausting and costly, so it attracts wide attention to investigate solutions for the weakly supervised scheme with only sparse points annotated.

Representation Learning Weakly supervised Semantic Segmentation +1


PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis

no code implementations • 10 Feb 2022 • Xianggang Yu, Jiapeng Tang, Yipeng Qin, Chenghong Li, Linchao Bao, Xiaoguang Han, Shuguang Cui

We present PVSeRF, a learning framework that reconstructs neural radiance fields from single-view RGB images, for novel view synthesis.

Disentanglement Novel View Synthesis


Expressive Talking Head Generation With Granular Audio-Visual Control

no code implementations • CVPR 2022 • Borong Liang, Yan Pan, Zhizhi Guo, Hang Zhou, Zhibin Hong, Xiaoguang Han, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Generating expressive talking heads is essential for creating virtual humans.

Talking Head Generation


DArch: Dental Arch Prior-Assisted 3D Tooth Instance Segmentation With Weak Annotations

no code implementations • CVPR 2022 • Liangdong Qiu, Chongjie Ye, Pei Chen, Yunbi Liu, Xiaoguang Han, Shuguang Cui

Experimental results on 4,773 dental models have shown our DArch can accurately segment each tooth of a dental model, and its performance is superior to the state-of-the-art methods.

Instance Segmentation Segmentation +1


ETHSeg: An Amodel Instance Segmentation Network and a Real-World Dataset for X-Ray Waste Inspection

no code implementations • CVPR 2022 • Lingteng Qiu, Zhangyang Xiong, Xuhao Wang, Kenkun Liu, Yihan Li, GuanYing Chen, Xiaoguang Han, Shuguang Cui

Inspired by the fact that X-ray has a strong penetrating power to see through the bag and overlapping objects, we propose to perform waste inspection efficiently using X-ray images without the need to open the bag.

Instance Segmentation Segmentation +1


Pose2Room: Understanding 3D Scenes from Human Activities

no code implementations • 1 Dec 2021 • Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner

To this end, we propose P2R-Net to learn a probabilistic 3D model of the objects in a scene characterized by their class categories and oriented 3D bounding boxes, based on an input observed human trajectory in the environment.

Object


Revisiting Temporal Alignment for Video Restoration

1 code implementation • CVPR 2022 • Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, Jiangbo Lu

Long-range temporal alignment is critical yet challenging for video restoration tasks.

Ranked #1 on Video Super-Resolution on Vimeo-90K

Deblurring Denoising +3


Pixel-level Intra-domain Adaptation for Semantic Segmentation

no code implementations • ACM International Conference on Multimedia 2021 • Zizheng Yan, Xianggang Yu, Yipeng Qin, Yushuang Wu, Xiaoguang Han, Shuguang Cui

Recent advances in unsupervised domain adaptation have achieved remarkable performance on semantic segmentation tasks.

Ranked #29 on Synthetic-to-Real Translation on GTAV-to-Cityscapes Labels

Segmentation Semantic Segmentation +2


SketchHairSalon: Deep Sketch-based Hair Image Synthesis

no code implementations • 16 Sep 2021 • Chufeng Xiao, Deng Yu, Xiaoguang Han, Youyi Zheng, Hongbo Fu

At the second stage, another network is trained to synthesize the structure and appearance of hair images from the input sketch and the generated matte.

Image Generation


Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts

2 code implementations • ICCV 2021 • Hong-Yu Zhou, Chixiang Lu, Sibei Yang, Xiaoguang Han, Yizhou Yu

From this perspective, we introduce Preservational Learning to reconstruct diverse image contexts in order to preserve more information in learned representations.

Contrastive Learning Representation Learning +1

105


ME-PCN: Point Completion Conditioned on Mask Emptiness

1 code implementation • ICCV 2021 • Bingchen Gong, Yinyu Nie, Yiqun Lin, Xiaoguang Han, Yizhou Yu

Main-stream methods predict the missing shapes by decoding a global feature learned from the input point cloud, which often leads to deficient results in preserving topology consistency and surface details.


SimpModeling: Sketching Implicit Field to Guide Mesh Modeling for 3D Animalmorphic Head Design

1 code implementation • 5 Aug 2021 • Zhongjin Luo, Jie Zhou, Heming Zhu, Dong Du, Xiaoguang Han, Hongbo Fu

In this work, we propose SimpModeling, a novel sketch-based system for helping users, especially amateur users, easily model 3D animalmorphic heads - a prevalent kind of heads in character design.


From Single to Multiple: Leveraging Multi-level Prediction Spaces for Video Forecasting

no code implementations • 21 Jul 2021 • Mengcheng Lan, Shuliang Ning, Yanran Li, Qian Chen, Xunlai Chen, Xiaoguang Han, Shuguang Cui

Although video forecasting has been a widely explored topic in recent years, the mainstream of the existing work still limits their models with a single prediction space and completely neglects the way to leverage their model with multi-prediction spaces.

Video Prediction


Transformer with Peak Suppression and Knowledge Guidance for Fine-grained Image Recognition

no code implementations • 14 Jul 2021 • Xinda Liu, Lili Wang, Xiaoguang Han

In this paper, we analyze the difficulties of fine-grained image recognition from a new perspective and propose a transformer architecture with the peak suppression module and knowledge guidance module, which respects the diversification of discriminative features in a single image and the aggregation of discriminative clues among multiple images.

Ranked #6 on Fine-Grained Image Classification on Stanford Dogs

Fine-Grained Image Classification Fine-Grained Image Recognition


Task-Aware Sampling Layer for Point-Wise Analysis

no code implementations • 9 Jul 2021 • Yiqun Lin, Lichang Chen, Haibin Huang, Chongyang Ma, Xiaoguang Han, Shuguang Cui

Sampling, grouping, and aggregation are three important components in the multi-scale analysis of point clouds.

Keypoint Detection Point Cloud Completion +1


Hepatocellular Carcinoma Segmentation from Digital Subtraction Angiography Videos using Learnable Temporal Difference

no code implementations • 9 Jul 2021 • Wenting Jiang, Yicheng Jiang, Lu Zhang, Changmiao Wang, Xiaoguang Han, Shuixing Zhang, Xiang Wan, Shuguang Cui

In this paper, we raise the problem of HCC segmentation in DSA videos, and build our own DSA dataset.

Segmentation


3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

no code implementations • CVPR 2021 • Yuda Qiu, Xiaojie Xu, Lingteng Qiu, Yan Pan, Yushuang Wu, Weikai Chen, Xiaoguang Han

Caricature is an artistic representation that deliberately exaggerates the distinctive features of a human face to convey humor or sarcasm.

Caricature Face Reconstruction


Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images

2 code implementations • CVPR 2021 • Haolin Liu, Anran Lin, Xiaoguang Han, Lei Yang, Yizhou Yu, Shuguang Cui

Grounding referring expressions in RGBD image has been an emerging field.

Object Visual Grounding


LapsCore: Language-Guided Person Search via Color Reasoning

no code implementations • ICCV 2021 • Yushuang Wu, Zizheng Yan, Xiaoguang Han, Guanbin Li, Changqing Zou, Shuguang Cui

The key point of language-guided person search is to construct the cross-modal association between visual and textual input.

Colorization Image Colorization +2


RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

1 code implementation • CVPR 2021 • Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner

In this work, we introduce RfD-Net that jointly detects and reconstructs dense object surfaces directly from raw point clouds.

Object object-detection +4

197


JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition

no code implementations • 16 Nov 2020 • Jinmiao Cai, Nianjuan Jiang, Xiaoguang Han, Kui Jia, Jiangbo Lu

Skeleton-based action recognition has attracted research attention in recent years.

Action Recognition Optical Flow Estimation +2


Skeleton-bridged Point Completion: From Global Inference to Local Adjustment

no code implementations • NeurIPS 2020 • Yinyu Nie, Yiqun Lin, Xiaoguang Han, Shihui Guo, Jian Chang, Shuguang Cui, Jian Jun Zhang

Existing works usually estimate the missing shape by decoding a latent feature encoded from the input points.

Surface Reconstruction


A deep learning based interactive sketching system for fashion images design

no code implementations • 9 Oct 2020 • Yao Li, Xianggang Yu, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

In this work, we propose an interactive system to design diverse high-quality garment images from fashion sketches and the texture information.

Intrinsic Image Decomposition Texture Synthesis


Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos

no code implementations • 18 Sep 2020 • Jie Wu, Guanbin Li, Xiaoguang Han, Liang Lin

Temporal grounding of natural language in untrimmed videos is a fundamental yet challenging multimedia task facilitating cross-media visual content retrieval.

reinforcement-learning Reinforcement Learning (RL) +2


Ultrasound Liver Fibrosis Diagnosis using Multi-indicator guided Deep Neural Networks

no code implementations • 10 Sep 2020 • Jiali Liu, Wenxuan Wang, Tianyao Guan, Ningbo Zhao, Xiaoguang Han, Zhen Li

An indicator-guided learning mechanism is further proposed to ease the training of the proposed model.


SkeletonNet: A Topology-Preserving Solution for Learning Mesh Reconstruction of Object Surfaces from RGB Images

1 code implementation • 13 Aug 2020 • Jiapeng Tang, Xiaoguang Han, Mingkui Tan, Xin Tong, Kui Jia

However, they all have their own drawbacks, and cannot properly reconstruct the surface shapes of complex topologies, arguably due to a lack of constraints on the topological structures in their learning frameworks.

Surface Reconstruction


Deep Fashion3D: A Dataset and Benchmark for 3D Garment Reconstruction from Single Images

2 code implementations • ECCV 2020 • Heming Zhu, Yu Cao, Hang Jin, Weikai Chen, Dong Du, Zhangye Wang, Shuguang Cui, Xiaoguang Han

High-fidelity clothing reconstruction is the key to achieving photorealism in a wide range of applications including human digitization, virtual try-on, etc.

Garment Reconstruction Virtual Try-on

410


Learning Inverse Rendering of Faces from Real-world Videos

1 code implementation • 26 Mar 2020 • Yuda Qiu, Zhangyang Xiong, Kai Han, Zhongyuan Wang, Zixiang Xiong, Xiaoguang Han

To alleviate this problem, we propose a weakly supervised training approach to train our model on real face videos, based on the assumption of consistency of albedo and normal across different frames, thus bridging the gap between real and synthetic face images.

Inverse Rendering

Paper
Code

Peeking into occluded joints: A novel framework for crowd pose estimation

1 code implementation • ECCV 2020 • Lingteng Qiu, Xuanye Zhang, Yan-ran Li, Guanbin Li, Xiao-Jun Wu, Zixiang Xiong, Xiaoguang Han, Shuguang Cui

Although occlusion widely exists in nature and remains a fundamental challenge for pose estimation, existing heatmap-based approaches suffer serious degradation on occlusions.

Pose Estimation

129

Paper
Code

HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose and Shape Estimation

1 code implementation • 10 Mar 2020 • Kun Zhou, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

Estimating 3D human pose from a single image is a challenging task.

3D human pose and shape estimation

Paper
Code

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

1 code implementation • CVPR 2020 • Yinyu Nie, Xiaoguang Han, Shihui Guo, Yujian Zheng, Jian Chang, Jian Jun Zhang

Semantic reconstruction of indoor scenes refers to both scene understanding and object reconstruction.

Ranked #2 on 3D Shape Reconstruction on Pix3D

3D Shape Reconstruction Monocular 3D Object Detection +5

403

Paper
Code

FPConv: Learning Local Flattening for Point Convolution

1 code implementation • CVPR 2020 • Yiqun Lin, Zizheng Yan, Haibin Huang, Dong Du, Ligang Liu, Shuguang Cui, Xiaoguang Han

We introduce FPConv, a novel surface-style convolution operator designed for 3D point cloud analysis.

3D Object Classification Scene Segmentation

132

Paper
Code

Shallow2Deep: Indoor Scene Modeling by Single Image Understanding

no code implementations • 22 Feb 2020 • Yinyu Nie, Shihui Guo, Jian Chang, Xiaoguang Han, Jiahui Huang, Shi-Min Hu, Jian Jun Zhang

Particularly, we design a shallow-to-deep architecture on the basis of convolutional networks for semantic scene understanding and modeling.

Relation Network Scene Understanding

Paper
Add Code

Self-Enhanced Convolutional Network for Facial Video Hallucination

no code implementations • 23 Nov 2019 • Chaowei Fang, Guanbin Li, Xiaoguang Han, Yizhou Yu

It further recurrently exploits the reconstructed results and intermediate features of a sequence of preceding frames to improve the initial super-resolution of the current frame by modelling the coherence of structural facial features across frames.

Hallucination Video Super-Resolution

Paper
Add Code

HEMlets Pose: Learning Part-Centric Heatmap Triplets for Accurate 3D Human Pose Estimation

no code implementations • ICCV 2019 • Kun Zhou, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

Estimating 3D human pose from a single image is a challenging task.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric, using extra training data)

Monocular 3D Human Pose Estimation

Paper
Add Code

Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks

no code implementations • ICCV 2019 • Junyi Pan, Xiaoguang Han, Weikai Chen, Jiapeng Tang, Kui Jia

The key to our approach is a novel progressive shaping framework that alternates between mesh deformation and topology modification.

Ranked #3 on 3D Shape Reconstruction on Pix3D

3D Shape Reconstruction

Paper
Add Code

A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Image

1 code implementation • CVPR 2019 • Jiapeng Tang, Xiaoguang Han, Junyi Pan, Kui Jia, Xin Tong

To this end, we propose in this paper a skeleton-bridged, stage-wise learning approach to address the challenge.

Paper
Code

A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images

1 code implementation • CVPR 2019 • Jiapeng Tang, Xiaoguang Han, Junyi Pan, Kui Jia, Xin Tong

To this end, we propose in this paper a skeleton-bridged, stage-wise learning approach to address the challenge.

Paper
Code

Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image

no code implementations • CVPR 2019 • Xiaoguang Han, Zhaoxuan Zhang, Dong Du, Mingdai Yang, Jingming Yu, Pan Pan, Xin Yang, Ligang Liu, Zixiang Xiong, Shuguang Cui

Given a single depth image, our method first goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide to the next view inpainting step, which attempts to make up the missing information; the third step involves projecting the volume under the same view of the input, concatenating them to complete the current view depth, and integrating all depth into the point cloud.

Paper
Add Code

Two-phase Hair Image Synthesis by Self-Enhancing Generative Model

no code implementations • 28 Feb 2019 • Haonan Qiu, Chuan Wang, Hang Zhu, Xiangyu Zhu, Jinjin Gu, Xiaoguang Han

Generating plausible hair images given limited guidance, such as sparse sketches or low-resolution images, has been made possible with the rise of Generative Adversarial Networks (GANs).

Image-to-Image Translation Super-Resolution +2

Paper
Add Code

Learning Mutually Local-global U-nets For High-resolution Retinal Lesion Segmentation in Fundus Images

no code implementations • 18 Jan 2019 • Zizheng Yan, Xiaoguang Han, Changmiao Wang, Yuda Qiu, Zixiang Xiong, Shuguang Cui

Due to high-resolution and small-size lesion regions, applying existing methods, such as U-Nets, to perform segmentation on fundus photography is very challenging.

Lesion Segmentation Segmentation

Paper
Add Code

Deep RBFNet: Point Cloud Feature Learning using Radial Basis Functions

no code implementations • 11 Dec 2018 • Weikai Chen, Xiaoguang Han, Guanbin Li, Chao Chen, Jun Xing, Yajie Zhao, Hao Li

Three-dimensional object recognition has recently achieved great progress thanks to the development of effective point cloud-based learning frameworks, such as PointNet and its extensions.

3D Object Recognition

Paper
Add Code

Adversarial 3D Human Pose Estimation via Multimodal Depth Supervision

no code implementations • 21 Sep 2018 • Kun Zhou, Jinmiao Cai, Yao Li, Yulong Shi, Xiaoguang Han, Nianjuan Jiang, Kui Jia, Jiangbo Lu

In this paper, a novel deep-learning based framework is proposed to infer 3D human poses from a single image.

3D Human Pose Estimation

Paper
Add Code

CaricatureShop: Personalized and Photorealistic Caricature Sketching

no code implementations • 24 Jul 2018 • Xiaoguang Han, Kangcheng Hou, Dong Du, Yuda Qiu, Yizhou Yu, Kun Zhou, Shuguang Cui

To construct the mapping between 2D sketches and a vertex-wise scaling field, a novel deep learning architecture is developed.

Caricature Face Model

Paper
Add Code

FBI-Pose: Towards Bridging the Gap between 2D Images and 3D Human Poses using Forward-or-Backward Information

no code implementations • 25 Jun 2018 • Yulong Shi, Xiaoguang Han, Nianjuan Jiang, Kun Zhou, Kui Jia, Jiangbo Lu

Although significant advances have been made in the area of human pose estimation from images using deep Convolutional Neural Networks (ConvNets), it remains a big challenge to perform 3D pose inference in-the-wild.

Ranked #227 on 3D Human Pose Estimation on Human3.6M

3D Human Pose Estimation

Paper
Add Code

Video Inpainting by Jointly Learning Temporal Structure and Spatial Details

no code implementations • 22 Jun 2018 • Chuan Wang, Haibin Huang, Xiaoguang Han, Jue Wang

We present a new data-driven video inpainting method for recovering missing regions of video frames.

Video Inpainting

Paper
Add Code

High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference

no code implementations • ICCV 2017 • Xiaoguang Han, Zhen Li, Haibin Huang, Evangelos Kalogerakis, Yizhou Yu

Our method is based on a new deep learning architecture consisting of two sub-networks: a global structure inference network and a local geometry refinement network.

Paper
Add Code

DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling

no code implementations • 7 Jun 2017 • Xiaoguang Han, Chang Gao, Yizhou Yu

This system has a labor-efficient sketching interface that allows the user to draw freehand, imprecise yet expressive 2D lines representing the contours of facial features.

Caricature

Paper
Add Code
