Search Results for author: Xiaowei Zhou

Found 95 papers, 51 papers with code

Generating Human Motion in 3D Scenes from Text Descriptions

no code implementations • 13 May 2024 • Zhi Cen, Huaijin Pi, Sida Peng, Zehong Shen, Minghui Yang, Shuai Zhu, Hujun Bao, Xiaowei Zhou

For motion generation, we design an object-centric scene representation for the generative model to focus on the target object, thereby reducing the scene complexity and facilitating the modeling of the relationship between human motions and the object.

Object

MaPa: Text-driven Photorealistic Material Painting for 3D Shapes

no code implementations • 26 Apr 2024 • Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou

Instead of relying on extensive paired data, i.e., 3D meshes with material graphs and corresponding text descriptions, to train a material graph generative model, we propose to leverage the pre-trained 2D diffusion model as a bridge to connect the text and material graphs.

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

no code implementations • 17 Apr 2024 • Xi Chen, Sida Peng, Dongchen Yang, YuAn Liu, Bowen Pan, Chengfei Lv, Xiaowei Zhou

This paper aims to recover object materials from posed images captured under an unknown static lighting condition.

Inverse Rendering Object

SpatialTracker: Tracking Any 2D Pixels in 3D Space

no code implementations • 5 Apr 2024 • Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou

Recovering dense and long-range pixel motion in videos is a challenging problem.

Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

1 code implementation • 15 Mar 2024 • Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

To enhance the modeling of both global and local features, we have devised a convolution and attention fusion module aimed at capturing long-range dependencies and neighborhood spectral correlations.

Hyperspectral Image Denoising Image Denoising

Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed

1 code implementation • 7 Mar 2024 • Yifan Wang, Xingyi He, Sida Peng, Dongli Tan, Xiaowei Zhou

Furthermore, we find spatial variance exists in LoFTR's fine correlation module, which is adverse to matching accuracy.

3D Reconstruction Image Retrieval

432

Reconstructing Close Human Interactions from Multiple Views

1 code implementation • 29 Jan 2024 • Qing Shuai, Zhiyuan Yu, Zhize Zhou, Lixin Fan, Haijun Yang, Can Yang, Xiaowei Zhou

This paper addresses the challenging task of reconstructing the poses of multiple individuals engaged in close interactions, captured by multiple calibrated cameras.

Pose Estimation

AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model

no code implementations • 27 Jan 2024 • Beijia Chen, Yuefan Shen, Qing Shuai, Xiaowei Zhou, Kun Zhou, Youyi Zheng

In this paper, we introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos (4-8 in our setting).

Street Gaussians for Modeling Dynamic Urban Scenes

no code implementations • 2 Jan 2024 • Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng

We introduce Street Gaussians, a new explicit scene representation that tackles all these limitations.

SAM-guided Graph Cut for 3D Instance Segmentation

no code implementations • 13 Dec 2023 • Haoyu Guo, He Zhu, Sida Peng, Yuang Wang, Yujun Shen, Ruizhen Hu, Xiaowei Zhou

Experimental results on the ScanNet, ScanNet++ and KITTI-360 datasets demonstrate that our method achieves robust segmentation performance and can generalize across different types of scenes.

3D Instance Segmentation Segmentation +1

EasyVolcap: Accelerating Neural Volumetric Video Research

1 code implementation • 11 Dec 2023 • Zhen Xu, Tao Xie, Sida Peng, Haotong Lin, Qing Shuai, Zhiyuan Yu, Guangzhao He, Jiaming Sun, Hujun Bao, Xiaowei Zhou

Volumetric video is a technology that digitally records dynamic events such as artistic performances, sporting events, and remote conversations.

564

SIRe-IR: Inverse Rendering for BRDF Reconstruction with Shadow and Illumination Removal in High-Illuminance Scenes

1 code implementation • 19 Oct 2023 • ZiYi Yang, Yanzhen Chen, Xinyu Gao, Yazhen Yuan, Yu Wu, Xiaowei Zhou, Xiaogang Jin

Implicit neural representation has opened up new possibilities for inverse rendering.

Inverse Rendering

4K4D: Real-Time 4D View Synthesis at 4K Resolution

no code implementations • 17 Oct 2023 • Zhen Xu, Sida Peng, Haotong Lin, Guangzhao He, Jiaming Sun, Yujun Shen, Hujun Bao, Xiaowei Zhou

Experiments show that our representation can be rendered at over 400 FPS on the DNA-Rendering dataset at 1080p resolution and 80 FPS on the ENeRF-Outdoor dataset at 4K resolution using an RTX 4090 GPU, which is 30x faster than previous methods and achieves the state-of-the-art rendering quality.

Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes

no code implementations • 12 Oct 2023 • Haotong Lin, Sida Peng, Zhen Xu, Tao Xie, Xingyi He, Hujun Bao, Xiaowei Zhou

This paper aims to tackle the challenge of dynamic view synthesis from multi-view videos.

Novel View Synthesis

Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models

no code implementations • ICCV 2023 • Huaijin Pi, Sida Peng, Minghui Yang, Xiaowei Zhou, Hujun Bao

This paper presents a novel approach to generating the 3D motion of a human interacting with a target object, with a focus on solving the challenge of synthesizing long-range and diverse motions, which could not be fulfilled by existing auto-regressive models or path planning-based methods.

Human-Object Interaction Detection

PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes

1 code implementation • 19 Sep 2023 • Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao

Moreover, PanopticNeRF-360 enables omnidirectional rendering of high-fidelity, multi-view and spatiotemporally consistent appearance, semantic and instance labels.

Self-Driving Cars

209

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior

1 code implementation • 25 Aug 2023 • Zhipeng Hu, Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Changjie Fan, Xiaowei Zhou, Xin Yu

This limitation leads to the Janus problem, where multi-faced 3D models are generated under the guidance of such diffusion models.

Text to 3D

Relightable and Animatable Neural Avatar from Sparse-View Video

no code implementations • 15 Aug 2023 • Zhen Xu, Sida Peng, Chen Geng, Linzhan Mou, Zihan Yan, Jiaming Sun, Hujun Bao, Xiaowei Zhou

Based on the HDQ algorithm, we leverage sphere tracing to efficiently estimate the surface intersection and light visibility.

Inverse Rendering

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

1 code implementation • 15 Aug 2023 • Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen

We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fields are jointly optimized to reconstruct it through a carefully tailored rendering pipeline. We advisedly introduce some regularizations into the optimization process, urging the canonical content field to inherit semantics (e.g., the object shape) from the video. With such a design, CoDeF naturally supports lifting image algorithms for video processing, in the sense that one can apply an image algorithm to the canonical image and effortlessly propagate the outcomes to the entire video with the aid of the temporal deformation field. We experimentally show that CoDeF is able to lift image-to-image translation to video-to-video translation and lift keypoint detection to keypoint tracking without any training. More importantly, thanks to our lifting strategy that deploys the algorithms on only one image, we achieve superior cross-frame consistency in processed videos compared to existing video-to-video translation approaches, and even manage to track non-rigid objects like water and smog. Project page can be found at https://qiuyu96.github.io/CoDeF/.

Image-to-Image Translation Keypoint Detection +1

4,775

Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields

no code implementations • 24 Jul 2023 • Shangzhan Zhang, Sida Peng, Yinji ShenTu, Qing Shuai, Tianrun Chen, Kaicheng Yu, Hujun Bao, Xiaowei Zhou

We extensively evaluate our approach on various scenes and show that our approach achieves spatially and temporally consistent editing results.

Detector-Free Structure from Motion

1 code implementation • 27 Jun 2023 • Xingyi He, Jiaming Sun, Yifan Wang, Sida Peng, QiXing Huang, Hujun Bao, Xiaowei Zhou

We propose a new detector-free SfM framework to draw benefits from the recent success of detector-free matchers to avoid the early determination of keypoints, while solving the multi-view inconsistency issue of detector-free matchers.

Keypoint Detection

493

Neural Scene Chronology

1 code implementation • CVPR 2023 • Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

Specifically, we represent the scene as a space-time radiance field with a per-image illumination embedding, where temporally-varying scene changes are encoded using a set of learned step functions.

111

Learning Human Mesh Recovery in 3D Scenes

no code implementations • CVPR 2023 • Zehong Shen, Zhi Cen, Sida Peng, Qing Shuai, Hujun Bao, Xiaowei Zhou

We present a novel method for recovering the absolute pose and shape of a human in a pre-scanned scene given a single image.

Human Mesh Recovery

AutoRecon: Automated 3D Object Discovery and Reconstruction

no code implementations • CVPR 2023 • Yuang Wang, Xingyi He, Sida Peng, Haotong Lin, Hujun Bao, Xiaowei Zhou

A fully automated object reconstruction pipeline is crucial for digital content creation.

3D Reconstruction Object +2

EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration

no code implementations • 2 May 2023 • Linghao Chen, Yuzhe Qin, Xiaowei Zhou, Hao Su

Hand-eye calibration is a critical task in robotics, as it directly affects the efficacy of critical operations such as manipulation and grasping.

TensoIR: Tensorial Inverse Rendering

1 code implementation • CVPR 2023 • Haian Jin, Isabella Liu, Peijia Xu, Xiaoshuai Zhang, Songfang Han, Sai Bi, Xiaowei Zhou, Zexiang Xu, Hao Su

We propose TensoIR, a novel inverse rendering approach based on tensor factorization and neural fields.

Inverse Rendering Novel View Synthesis

213

UTSGAN: Unseen Transition Suss GAN for Transition-Aware Image-to-image Translation

no code implementations • 24 Apr 2023 • Yaxin Shi, Xiaowei Zhou, Ping Liu, Ivor W. Tsang

Furthermore, we propose the use of transition consistency, defined on the transition variable, to enable regularization of consistency on unobserved translations, which is omitted in previous works.

Attribute Image-to-Image Translation +1

Long-term Visual Localization with Mobile Sensors

no code implementations • CVPR 2023 • Shen Yan, Yu Liu, Long Wang, Zehong Shen, Zhen Peng, Haomin Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Despite the remarkable advances in image matching and pose estimation, image-based localization of a camera in a temporally-varying outdoor environment is still a challenging problem due to huge appearance disparity between query and reference images caused by illumination, seasonal and structural changes.

Image-Based Localization Pose Estimation +1

Representing Volumetric Videos as Dynamic MLP Maps

no code implementations • CVPR 2023 • Sida Peng, Yunzhi Yan, Qing Shuai, Hujun Bao, Xiaowei Zhou

This paper introduces a novel representation of volumetric videos for real-time view synthesis of dynamic scenes.

Decoder

Perceiving Unseen 3D Objects by Poking the Objects

no code implementations • 26 Feb 2023 • Linghao Chen, Yunzhou Song, Hujun Bao, Xiaowei Zhou

We present a novel approach to interactive 3D object perception for robots.

3D Reconstruction Robotic Grasping

Learning Neural Volumetric Representations of Dynamic Humans in Minutes

1 code implementation • CVPR 2023 • Chen Geng, Sida Peng, Zhen Xu, Hujun Bao, Xiaowei Zhou

In this paper, we propose a novel method for learning neural volumetric videos of dynamic humans from sparse view videos in minutes with competitive visual quality.

149

Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask

no code implementations • CVPR 2023 • Shangzhan Zhang, Sida Peng, Tianrun Chen, Linzhan Mou, Haotong Lin, Kaicheng Yu, Yiyi Liao, Xiaowei Zhou

We introduce a novel approach that takes a single semantic mask as input to synthesize multi-view consistent color images of natural scenes, trained with a collection of single images from the Internet.

3D-Aware Image Synthesis

OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models

no code implementations • 18 Jan 2023 • Xingyi He, Jiaming Sun, Yuang Wang, Di Huang, Hujun Bao, Xiaowei Zhou

We propose a new method for object pose estimation without CAD models.

Keypoint Detection Object

Deep Active Contours for Real-time 6-DoF Object Tracking

no code implementations • ICCV 2023 • Long Wang, Shen Yan, Jianan Zhen, Yu Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Specifically, given an initial pose, we project the object model to the image plane to obtain the initial contour and use a lightweight network to predict how the contour should move to match the true object boundary, which provides the gradients to optimize the object pose.

Computational Efficiency Object +1

Ponder: Point Cloud Pre-training via Neural Rendering

no code implementations • ICCV 2023 • Di Huang, Sida Peng, Tong He, Honghui Yang, Xiaowei Zhou, Wanli Ouyang

We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering.

3D Reconstruction Image Generation +2

Reconstructing Hand-Held Objects from Monocular Video

no code implementations • 30 Nov 2022 • Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai, Wanli Ouyang, Xiaowei Zhou

The key idea is that the hand motion naturally provides multiple views of the object and the motion can be reliably estimated by a hand pose tracker.

Hand Pose Estimation Object

QuickPose: Real-time Multi-view Multi-person Pose Estimation in Crowded Scenes

no code implementations • SIGGRAPH 2022 • Zhize Zhou, Qing Shuai, Yize Wang, Qi Fang, Xiaopeng Ji, Fashuai Li, Hujun Bao, Xiaowei Zhou

The key challenge of this problem is to efficiently match 2D observations across multiple views.

Ranked #2 on 3D Multi-Person Pose Estimation on Shelf

2D Pose Estimation 3D Multi-Person Pose Estimation +1

PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

1 code implementation • CVPR 2022 • Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang

We present PlanarRecon -- a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video.

3D Plane Detection

277

LADDER: Latent Boundary-guided Adversarial Training

1 code implementation • 8 Jun 2022 • Xiaowei Zhou, Ivor W. Tsang, Jie Yin

To achieve a better trade-off between standard accuracy and adversarial robustness, we propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER) that adversarially trains DNN models on latent boundary-guided adversarial examples.

Adversarial Robustness

Neural 3D Reconstruction in the Wild

1 code implementation • 25 May 2022 • Jiaming Sun, Xi Chen, Qianqian Wang, Zhengqi Li, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

We are witnessing an explosion of neural implicit representations in computer vision and graphics.

3D Reconstruction Surface Reconstruction

687

OnePose: One-Shot Object Pose Estimation without CAD Models

1 code implementation • CVPR 2022 • Jiaming Sun, ZiHao Wang, Siyu Zhang, Xingyi He, Hongcheng Zhao, Guofeng Zhang, Xiaowei Zhou

We propose a new method named OnePose for object pose estimation.

6D Pose Estimation Graph Attention +2

904

Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation

no code implementations • CVPR 2022 • Jian Zhang, Yuanqing Zhang, Huan Fu, Xiaowei Zhou, Bowen Cai, Jinchi Huang, Rongfei Jia, Binqiang Zhao, Xing Tang

Neural Radiance Fields (NeRF) have emerged as a potent paradigm for representing scenes and synthesizing photo-realistic images.

Image Generation

Neural 3D Scene Reconstruction with the Manhattan-world Assumption

1 code implementation • CVPR 2022 • Haoyu Guo, Sida Peng, Haotong Lin, Qianqian Wang, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network.

2D Semantic Segmentation 3D Reconstruction +2

488

Modeling Indirect Illumination for Inverse Rendering

1 code implementation • CVPR 2022 • Yuanqing Zhang, Jiaming Sun, Xingyi He, Huan Fu, Rongfei Jia, Xiaowei Zhou

The key insight is that indirect illumination can be conveniently derived from the neural radiance field learned from input images instead of being estimated jointly with direct illumination and materials.

Inverse Rendering

166

Semantic keypoint-based pose estimation from single RGB frames

1 code implementation • 12 Apr 2022 • Karl Schmeckpeper, Philip R. Osteen, Yufu Wang, Georgios Pavlakos, Kenneth Chaney, Wyatt Jordan, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.

Object Pose Estimation

Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

1 code implementation • 29 Mar 2022 • Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao

In this work, we present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels from easy-to-obtain coarse 3D bounding primitives.

Instance Segmentation Scene Segmentation

209

Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation

1 code implementation • CVPR 2022 • Xian Liu, Qianyi Wu, Hang Zhou, Yinghao Xu, Rui Qian, Xinyi Lin, Xiaowei Zhou, Wayne Wu, Bo Dai, Bolei Zhou

To enhance the quality of synthesized gestures, we develop a contrastive learning strategy based on audio-text alignment for better audio representations.

Ranked #3 on Gesture Generation on TED Gesture Dataset

Contrastive Learning Gesture Generation

119

Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos

1 code implementation • 15 Mar 2022 • Sida Peng, Zhen Xu, Junting Dong, Qianqian Wang, Shangzhan Zhang, Qing Shuai, Hujun Bao, Xiaowei Zhou

Some recent works have proposed to decompose a non-rigidly deforming scene into a canonical neural radiance field and a set of deformation fields that map observation-space points to the canonical space, thereby enabling them to learn the dynamic scene from images.

493

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation • 13 Feb 2022 • Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Efficient Neural Radiance Fields for Interactive Free-viewpoint Video

no code implementations • 2 Dec 2021 • Haotong Lin, Sida Peng, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao, Xiaowei Zhou

We propose a novel scene representation, called ENeRF, for the fast creation of interactive free-viewpoint videos.

Depth Estimation Depth Prediction +1

Edge but not Least: Cross-View Graph Pooling

no code implementations • 24 Sep 2021 • Xiaowei Zhou, Jie Yin, Ivor W. Tsang

Graph neural networks have emerged as a powerful model for graph representation learning to undertake graph-level prediction tasks.

Graph Classification Graph Regression +1

Neural Rays for Occlusion-aware Image-based Rendering

1 code implementation • CVPR 2022 • YuAn Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang

On such a 3D point, these generalization methods will include inconsistent image features from invisible views, which interfere with the radiance field construction.

Neural Rendering Novel View Synthesis +1

400

VS-Net: Voting with Segmentation for Visual Localization

1 code implementation • CVPR 2021 • Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li

To address this problem, we propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks.

Segmentation Semantic Segmentation +1

Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies

1 code implementation • ICCV 2021 • Sida Peng, Junting Dong, Qianqian Wang, Shangzhan Zhang, Qing Shuai, Xiaowei Zhou, Hujun Bao

Moreover, the learned blend weight fields can be combined with input skeletal motions to generate new deformation fields to animate the human model.

493

Reconstructing 3D Human Pose by Watching Humans in the Mirror

1 code implementation • CVPR 2021 • Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.

3D Pose Estimation

3,375

LoFTR: Detector-Free Local Feature Matching with Transformers

5 code implementations • CVPR 2021 • Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou

We present a novel method for local image feature matching.

Image Matching Visual Localization

9,510

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video

3 code implementations • CVPR 2021 • Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, Hujun Bao

We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video.

3D Reconstruction 3D Scene Reconstruction +1

1,954

Generative Transition Mechanism to Image-to-Image Translation via Encoded Transformation

no code implementations • 9 Mar 2021 • Yaxin Shi, Xiaowei Zhou, Ping Liu, Ivor Tsang

To benefit the generalization ability of the translation model, we propose transition encoding to facilitate explicit regularization of these two kinds of consistencies on unseen transitions.

Attribute Image Reconstruction +2

Human-Understandable Decision Making for Visual Recognition

no code implementations • 5 Mar 2021 • Xiaowei Zhou, Jie Yin, Ivor Tsang, Chen Wang

The widespread use of deep neural networks has achieved substantial success in many tasks.

Decision Making

You Don't Only Look Once: Constructing Spatial-Temporal Memory for Integrated 3D Object Detection and Tracking

no code implementations • ICCV 2021 • Jiaming Sun, Yiming Xie, Siyu Zhang, Linghao Chen, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

In this work, we propose a novel system for integrated 3D object detection and tracking, which uses a dynamic object occupancy map and previous object states as spatial-temporal memory to assist object detection in future frames.

3D Object Detection Object +2

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

3 code implementations • CVPR 2021 • Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou

To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.

Novel View Synthesis Representation Learning

3,375

Learning Hybrid Representations for Automatic 3D Vessel Centerline Extraction

no code implementations • 14 Dec 2020 • Jiafa He, Chengwei Pan, Can Yang, Ming Zhang, Yang Wang, Xiaowei Zhou, Yizhou Yu

The main idea is to use CNNs to learn local appearances of vessels in image crops while using another point-cloud network to learn the global geometry of vessels in the entire image.

Representation Learning

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation

1 code implementation • ECCV 2020 • Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou

Recovering multi-person 3D poses with absolute scales from a single RGB image is a challenging problem due to the inherent depth and scale ambiguity from a single view.

Ranked #11 on 3D Multi-Person Pose Estimation (absolute) on MuPoTS-3D

2D Pose Estimation 3D Depth Estimation +3

239

Motion Capture from Internet Videos

2 code implementations • ECCV 2020 • Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao

Therefore, we propose to capture human motion by jointly analyzing these Internet videos instead of using single videos separately.

Pose Estimation

3,375

Coherent Reconstruction of Multiple Humans from a Single Image

1 code implementation • CVPR 2020 • Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.

Ranked #2 on 3D Human Reconstruction on AGORA

3D Depth Estimation 3D Human Reconstruction +4

364

Learning Feature Descriptors using Camera Pose Supervision

1 code implementation • ECCV 2020 • Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, Noah Snavely

Recent research on learned visual descriptors has shown promising improvements in correspondence estimation, a key component of many 3D vision tasks.

180

Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation

1 code implementation • CVPR 2020 • Jiaming Sun, Linghao Chen, Yiming Xie, Siyu Zhang, Qinhong Jiang, Xiaowei Zhou, Hujun Bao

In this paper, we propose a novel system named Disp R-CNN for 3D object detection from stereo images.

Ranked #3 on 3D Object Detection From Stereo Images on KITTI Cyclists Moderate

3D Object Detection From Stereo Images Disparity Estimation +2

210

Monocular Human Pose and Shape Reconstruction using Part Differentiable Rendering

no code implementations • 24 Mar 2020 • Min Wang, Feng Qiu, Wentao Liu, Chen Qian, Xiaowei Zhou, Lizhuang Ma

In this paper, we introduce body part segmentation as critical supervision.

Ranked #97 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation 3D Pose Estimation +3

Deep Snake for Real-Time Instance Segmentation

1 code implementation • CVPR 2020 • Sida Peng, Wen Jiang, Huaijin Pi, Xiuli Li, Hujun Bao, Xiaowei Zhou

Based on deep snake, we develop a two-stage pipeline for instance segmentation: initial contour proposal and contour deformation, which can handle errors in object localization.

Ranked #2 on Semantic Contour Prediction on Sbd val

Object Object Localization +3

1,147

GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs

1 code implementation • NeurIPS 2019 • Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou

Instead of feature pooling, we use group convolutions to exploit underlying structures of the extracted features on the group, resulting in descriptors that are both discriminative and provably invariant to the group of transformations.

Pose Estimation

191

Latent Adversarial Defence with Boundary-guided Generation

no code implementations • 16 Jul 2019 • Xiaowei Zhou, Ivor W. Tsang, Jie Yin

The proposed LAD method improves the robustness of a DNN model through adversarial training on generated adversarial examples.

Learning Transformation Synchronization

1 code implementation • CVPR 2019 • Xiangru Huang, Zhenxiao Liang, Xiaowei Zhou, Yao Xie, Leonidas Guibas, Qi-Xing Huang

Our approach alternates between transformation synchronization using weighted relative transformations and predicting new weights of the input relative transformations using a neural network.

Fast and Robust Multi-Person 3D Pose Estimation from Multiple Views

4 code implementations • CVPR 2019 • Junting Dong, Wen Jiang, Qi-Xing Huang, Hujun Bao, Xiaowei Zhou

This paper addresses the problem of 3D pose estimation for multiple people in a few calibrated camera views.

Ranked #12 on 3D Multi-Person Pose Estimation on Campus

3D Multi-Person Pose Estimation 3D Pose Estimation

3,375

Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion

1 code implementation • CVPR 2019 • Zhenpei Yang, Jeffrey Z. Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman, Qi-Xing Huang

In particular, instead of only performing scene completion from each individual scan, our approach alternates between relative pose estimation and scene completion.

Pose Estimation

149

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

4 code implementations • CVPR 2019 • Sida Peng, Yu-An Liu, Qi-Xing Huang, Hujun Bao, Xiaowei Zhou

We further create a Truncation LINEMOD dataset to validate the robustness of our approach against truncation.

Ranked #2 on 6D Pose Estimation using RGB on YCB-Video (Mean AUC metric)

6D Pose Estimation using RGB

792

Path-Invariant Map Networks

1 code implementation • CVPR 2019 • Zaiwei Zhang, Zhenxiao Liang, Lemeng Wu, Xiaowei Zhou, Qi-Xing Huang

Optimizing a network of maps among a collection of objects/domains (or map synchronization) is a central problem across computer vision and many other relevant fields.

3D Semantic Segmentation Scene Segmentation +1

Ordinal Depth Supervision for 3D Human Pose Estimation

1 code implementation • CVPR 2018 • Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

This information can be acquired by human annotators for a wide range of images and poses.

Ranked #1 on Monocular 3D Human Pose Estimation on Human3.6M (Use Video Sequence metric)

Monocular 3D Human Pose Estimation

111

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

no code implementations • CVPR 2018 • Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.

Ranked #120 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

Human Motion Capture Using a Drone

1 code implementation • 17 Apr 2018 • Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis

Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments.

Multi-Image Semantic Matching by Mining Consistent Features

2 code implementations • CVPR 2018 • Qianqian Wang, Xiaowei Zhou, Kostas Daniilidis

This work proposes a multi-image matching method to estimate semantic correspondences across multiple images.

Graph Matching Object

Fast Multi-Image Matching via Density-Based Clustering

no code implementations • ICCV 2017 • Roberto Tron, Xiaowei Zhou, Carlos Esteves, Kostas Daniilidis

We consider the problem of finding consistent matches across multiple images.

Clustering

Polar Transformer Networks

1 code implementation • ICLR 2018 • Carlos Esteves, Christine Allen-Blanchette, Xiaowei Zhou, Kostas Daniilidis

The result is a network invariant to translation and equivariant to both rotation and scale.

Rotated MNIST Translation

Harvesting Multiple Views for Marker-less 3D Human Pose Annotations

no code implementations • CVPR 2017 • Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

In this paper, we present a geometry-driven approach to automatically collect annotations for human pose prediction tasks.

Ranked #28 on Weakly-supervised 3D Human Pose Estimation on Human3.6M

Pose Prediction Weakly-supervised 3D Human Pose Estimation

6-DoF Object Pose from Semantic Keypoints

1 code implementation • 14 Mar 2017 • Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis

This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.

Ranked #1 on Keypoint Detection on Pascal3D+

Keypoint Detection Object +1

MonoCap: Monocular Human Motion Capture using a CNN Coupled with a Geometric Prior

1 code implementation • 9 Jan 2017 • Xiaowei Zhou, Menglong Zhu, Georgios Pavlakos, Spyridon Leonardos, Konstantinos G. Derpanis, Kostas Daniilidis

Recovering 3D full-body human pose is a challenging problem with many applications.

Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose

3 code implementations • CVPR 2017 • Georgios Pavlakos, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

This paper addresses the challenge of 3D human pose estimation from a single color image.

Ranked #16 on 3D Human Pose Estimation on HumanEva-I

3D Human Pose Estimation

Single Image Pop-Up From Discriminatively Learned Parts

no code implementations • ICCV 2015 • Menglong Zhu, Xiaowei Zhou, Kostas Daniilidis

We introduce a new approach for estimating a fine grained 3D shape and continuous pose of an object from a single image.

Sparseness Meets Deepness: 3D Human Pose Estimation from Monocular Video

1 code implementation • CVPR 2016 • Xiaowei Zhou, Menglong Zhu, Spyridon Leonardos, Kosta Derpanis, Kostas Daniilidis

Here, two cases are considered: (i) the image locations of the human joints are provided and (ii) the image locations of joints are unknown.

Ranked #38 on Monocular 3D Human Pose Estimation on Human3.6M

2D Pose Estimation 3D Pose Estimation +1

Sparse Representation for 3D Shape Estimation: A Convex Relaxation Approach

no code implementations • 14 Sep 2015 • Xiaowei Zhou, Menglong Zhu, Spyridon Leonardos, Kostas Daniilidis

We investigate the problem of estimating the 3D shape of an object defined by a set of 3D landmarks, given their 2D correspondences in a single image.

Ranked #127 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

Multi-Image Matching via Fast Alternating Minimization

1 code implementation • ICCV 2015 • Xiaowei Zhou, Menglong Zhu, Kostas Daniilidis

In this paper we propose a global optimization-based approach to jointly matching a set of images.

Pose and Shape Estimation with Discriminatively Learned Parts

no code implementations • 1 Feb 2015 • Menglong Zhu, Xiaowei Zhou, Kostas Daniilidis

We introduce a new approach for estimating the 3D pose and the 3D shape of an object from a single image.

3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

no code implementations • CVPR 2015 • Xiaowei Zhou, Spyridon Leonardos, Xiaoyan Hu, Kostas Daniilidis

We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image.

Low-Rank Modeling and Its Applications in Image Analysis

no code implementations • 15 Jan 2014 • Xiaowei Zhou, Can Yang, Hongyu Zhao, Weichuan Yu

In this paper, we review recent advances in low-rank modeling, the state-of-the-art algorithms, and related applications in image analysis.

Collaborative Filtering Matrix Completion

Active Contours with Group Similarity

no code implementations • CVPR 2013 • Xiaowei Zhou, Xiaojie Huang, James S. Duncan, Weichuan Yu

In this paper, we propose to use the group similarity of object shapes in multiple images as a prior to aid segmentation, which can be interpreted as an unsupervised approach to shape prior modeling.

Image Segmentation Semantic Segmentation

Moving Object Detection by Detecting Contiguous Outliers in the Low-Rank Representation

no code implementations • 5 Sep 2011 • Xiaowei Zhou, Can Yang, Weichuan Yu

To automate the analysis, object detection without a separate training phase becomes a critical task.

Moving Object Detection Object +1
