Search Results for author: Xiaowei Zhou

Found 93 papers, 51 papers with code

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

1 code implementation 15 Aug 2023 Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen

We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fields are jointly optimized to reconstruct it through a carefully tailored rendering pipeline. We advisedly introduce some regularizations into the optimization process, urging the canonical content field to inherit semantics (e.g., the object shape) from the video. With such a design, CoDeF naturally supports lifting image algorithms for video processing, in the sense that one can apply an image algorithm to the canonical image and effortlessly propagate the outcomes to the entire video with the aid of the temporal deformation field. We experimentally show that CoDeF is able to lift image-to-image translation to video-to-video translation and lift keypoint detection to keypoint tracking without any training. More importantly, thanks to our lifting strategy that deploys the algorithms on only one image, we achieve superior cross-frame consistency in processed videos compared to existing video-to-video translation approaches, and even manage to track non-rigid objects like water and smog. The project page can be found at https://qiuyu96.github.io/CoDeF/.

Image-to-Image Translation Keypoint Detection +1

Motion Capture from Internet Videos

2 code implementations ECCV 2020 Junting Dong, Qing Shuai, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao

Therefore, we propose to capture human motion by jointly analyzing these Internet videos instead of using single videos separately.

Pose Estimation

Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans

3 code implementations CVPR 2021 Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, Xiaowei Zhou

To this end, we propose Neural Body, a new human body representation which assumes that the learned neural representations at different frames share the same set of latent codes anchored to a deformable mesh, so that the observations across frames can be naturally integrated.

Novel View Synthesis Representation Learning

Reconstructing 3D Human Pose by Watching Humans in the Mirror

1 code implementation CVPR 2021 Qi Fang, Qing Shuai, Junting Dong, Hujun Bao, Xiaowei Zhou

In this paper, we introduce the new task of reconstructing 3D human pose from a single image in which we can see the person and the person's image through a mirror.

3D Pose Estimation

Deep Snake for Real-Time Instance Segmentation

1 code implementation CVPR 2020 Sida Peng, Wen Jiang, Huaijin Pi, Xiuli Li, Hujun Bao, Xiaowei Zhou

Based on deep snake, we develop a two-stage pipeline for instance segmentation: initial contour proposal and contour deformation, which can handle errors in object localization.

Object Object Localization +3

PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation

4 code implementations CVPR 2019 Sida Peng, Yu-An Liu, Qi-Xing Huang, Hujun Bao, Xiaowei Zhou

We further create a Truncation LINEMOD dataset to validate the robustness of our approach against truncation.

Ranked #2 on 6D Pose Estimation using RGB on YCB-Video (Mean AUC metric)

6D Pose Estimation using RGB

Neural 3D Reconstruction in the Wild

1 code implementation 25 May 2022 Jiaming Sun, Xi Chen, Qianqian Wang, Zhengqi Li, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

We are witnessing an explosion of neural implicit representations in computer vision and graphics.

3D Reconstruction Surface Reconstruction

Animatable Neural Radiance Fields for Modeling Dynamic Human Bodies

1 code implementation ICCV 2021 Sida Peng, Junting Dong, Qianqian Wang, Shangzhan Zhang, Qing Shuai, Xiaowei Zhou, Hujun Bao

Moreover, the learned blend weight fields can be combined with input skeletal motions to generate new deformation fields to animate the human model.

Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos

1 code implementation 15 Mar 2022 Sida Peng, Zhen Xu, Junting Dong, Qianqian Wang, Shangzhan Zhang, Qing Shuai, Hujun Bao, Xiaowei Zhou

Some recent works have proposed to decompose a non-rigidly deforming scene into a canonical neural radiance field and a set of deformation fields that map observation-space points to the canonical space, thereby enabling them to learn the dynamic scene from images.

Neural 3D Scene Reconstruction with the Manhattan-world Assumption

1 code implementation CVPR 2022 Haoyu Guo, Sida Peng, Haotong Lin, Qianqian Wang, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

Based on the Manhattan-world assumption, planar constraints are employed to regularize the geometry in floor and wall regions predicted by a 2D semantic segmentation network.

2D Semantic Segmentation 3D Reconstruction +2

EasyVolcap: Accelerating Neural Volumetric Video Research

1 code implementation 11 Dec 2023 Zhen Xu, Tao Xie, Sida Peng, Haotong Lin, Qing Shuai, Zhiyuan Yu, Guangzhao He, Jiaming Sun, Hujun Bao, Xiaowei Zhou

Volumetric video is a technology that digitally records dynamic events such as artistic performances, sporting events, and remote conversations.

Neural Rays for Occlusion-aware Image-based Rendering

1 code implementation CVPR 2022 YuAn Liu, Sida Peng, Lingjie Liu, Qianqian Wang, Peng Wang, Christian Theobalt, Xiaowei Zhou, Wenping Wang

On such a 3D point, these generalization methods will include inconsistent image features from invisible views, which interfere with the radiance field construction.

Neural Rendering Novel View Synthesis +1

Detector-Free Structure from Motion

1 code implementation 27 Jun 2023 Xingyi He, Jiaming Sun, Yifan Wang, Sida Peng, QiXing Huang, Hujun Bao, Xiaowei Zhou

We propose a new detector-free SfM framework to draw benefits from the recent success of detector-free matchers to avoid the early determination of keypoints, while solving the multi-view inconsistency issue of detector-free matchers.

Keypoint Detection

Coherent Reconstruction of Multiple Humans from a Single Image

1 code implementation CVPR 2020 Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.

3D Depth Estimation 3D Human Reconstruction +4

PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

1 code implementation CVPR 2022 Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang

We present PlanarRecon -- a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video.

3D Plane Detection

Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed

1 code implementation 7 Mar 2024 Yifan Wang, Xingyi He, Sida Peng, Dongli Tan, Xiaowei Zhou

Furthermore, we find spatial variance exists in LoFTR's fine correlation module, which is adverse to matching accuracy.

3D Reconstruction Image Retrieval

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation

1 code implementation ECCV 2020 Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou

Recovering multi-person 3D poses with absolute scales from a single RGB image is a challenging problem due to the inherent depth and scale ambiguity from a single view.

2D Pose Estimation 3D Depth Estimation +3

Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

1 code implementation 29 Mar 2022 Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao

In this work, we present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels from easy-to-obtain coarse 3D bounding primitives.

Instance Segmentation Scene Segmentation

PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes

1 code implementation 19 Sep 2023 Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao

Moreover, PanopticNeRF-360 enables omnidirectional rendering of high-fidelity, multi-view and spatiotemporally consistent appearance, semantic and instance labels.

Self-Driving Cars

GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs

1 code implementation NeurIPS 2019 Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou

Instead of feature pooling, we use group convolutions to exploit underlying structures of the extracted features on the group, resulting in descriptors that are both discriminative and provably invariant to the group of transformations.

Pose Estimation

Learning Feature Descriptors using Camera Pose Supervision

1 code implementation ECCV 2020 Qianqian Wang, Xiaowei Zhou, Bharath Hariharan, Noah Snavely

Recent research on learned visual descriptors has shown promising improvements in correspondence estimation, a key component of many 3D vision tasks.

Modeling Indirect Illumination for Inverse Rendering

1 code implementation CVPR 2022 Yuanqing Zhang, Jiaming Sun, Xingyi He, Huan Fu, Rongfei Jia, Xiaowei Zhou

The key insight is that indirect illumination can be conveniently derived from the neural radiance field learned from input images instead of being estimated jointly with direct illumination and materials.

Inverse Rendering

Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion

1 code implementation CVPR 2019 Zhenpei Yang, Jeffrey Z. Pan, Linjie Luo, Xiaowei Zhou, Kristen Grauman, Qi-Xing Huang

In particular, instead of only performing scene completion from each individual scan, our approach alternates between relative pose estimation and scene completion.

Pose Estimation

Learning Neural Volumetric Representations of Dynamic Humans in Minutes

1 code implementation CVPR 2023 Chen Geng, Sida Peng, Zhen Xu, Hujun Bao, Xiaowei Zhou

In this paper, we propose a novel method for learning neural volumetric videos of dynamic humans from sparse view videos in minutes with competitive visual quality.

Neural Scene Chronology

1 code implementation CVPR 2023 Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

Specifically, we represent the scene as a space-time radiance field with a per-image illumination embedding, where temporally-varying scene changes are encoded using a set of learned step functions.

6-DoF Object Pose from Semantic Keypoints

1 code implementation 14 Mar 2017 Georgios Pavlakos, Xiaowei Zhou, Aaron Chan, Konstantinos G. Derpanis, Kostas Daniilidis

This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.

Keypoint Detection Object +1

VS-Net: Voting with Segmentation for Visual Localization

1 code implementation CVPR 2021 Zhaoyang Huang, Han Zhou, Yijin Li, Bangbang Yang, Yan Xu, Xiaowei Zhou, Hujun Bao, Guofeng Zhang, Hongsheng Li

To address this problem, we propose a novel visual localization framework that establishes 2D-to-3D correspondences between the query image and the 3D map with a series of learnable scene-specific landmarks.

Segmentation Semantic Segmentation +1

Path-Invariant Map Networks

1 code implementation CVPR 2019 Zaiwei Zhang, Zhenxiao Liang, Lemeng Wu, Xiaowei Zhou, Qi-Xing Huang

Optimizing a network of maps among a collection of objects/domains (or map synchronization) is a central problem across computer vision and many other relevant fields.

3D Semantic Segmentation Scene Segmentation +1

Multi-Image Semantic Matching by Mining Consistent Features

2 code implementations CVPR 2018 Qianqian Wang, Xiaowei Zhou, Kostas Daniilidis

This work proposes a multi-image matching method to estimate semantic correspondences across multiple images.

Graph Matching Object

Learning Transformation Synchronization

1 code implementation CVPR 2019 Xiangru Huang, Zhenxiao Liang, Xiaowei Zhou, Yao Xie, Leonidas Guibas, Qi-Xing Huang

Our approach alternates between transformation synchronization using weighted relative transformations and predicting new weights of the input relative transformations using a neural network.

Reconstructing Close Human Interactions from Multiple Views

1 code implementation 29 Jan 2024 Qing Shuai, Zhiyuan Yu, Zhize Zhou, Lixin Fan, Haijun Yang, Can Yang, Xiaowei Zhou

This paper addresses the challenging task of reconstructing the poses of multiple individuals engaged in close interactions, captured by multiple calibrated cameras.

Pose Estimation

Visual Sound Localization in the Wild by Cross-Modal Interference Erasing

1 code implementation 13 Feb 2022 Xian Liu, Rui Qian, Hang Zhou, Di Hu, Weiyao Lin, Ziwei Liu, Bolei Zhou, Xiaowei Zhou

Specifically, we observe that the previous practice of learning only a single audio representation is insufficient due to the additive nature of audio signals.

Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising

1 code implementation 15 Mar 2024 Shuai Hu, Feng Gao, Xiaowei Zhou, Junyu Dong, Qian Du

To enhance the modeling of both global and local features, we have devised a convolution and attention fusion module aimed at capturing long-range dependencies and neighborhood spectral correlations.

Hyperspectral Image Denoising Image Denoising

Human Motion Capture Using a Drone

1 code implementation 17 Apr 2018 Xiaowei Zhou, Sikang Liu, Georgios Pavlakos, Vijay Kumar, Kostas Daniilidis

Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, which can be used only in constrained environments.

EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior

1 code implementation 25 Aug 2023 Zhipeng Hu, Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Changjie Fan, Xiaowei Zhou, Xin Yu

This limitation leads to the Janus problem, where multi-faced 3D models are generated under the guidance of such diffusion models.

Text to 3D

Multi-Image Matching via Fast Alternating Minimization

1 code implementation ICCV 2015 Xiaowei Zhou, Menglong Zhu, Kostas Daniilidis

In this paper we propose a global optimization-based approach to jointly matching a set of images.

Semantic keypoint-based pose estimation from single RGB frames

1 code implementation 12 Apr 2022 Karl Schmeckpeper, Philip R. Osteen, Yufu Wang, Georgios Pavlakos, Kenneth Chaney, Wyatt Jordan, Xiaowei Zhou, Konstantinos G. Derpanis, Kostas Daniilidis

Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.

Object Pose Estimation

LADDER: Latent Boundary-guided Adversarial Training

1 code implementation 8 Jun 2022 Xiaowei Zhou, Ivor W. Tsang, Jie Yin

To achieve a better trade-off between standard accuracy and adversarial robustness, we propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER) that adversarially trains DNN models on latent boundary-guided adversarial examples.

Adversarial Robustness

Learning to Estimate 3D Human Pose and Shape from a Single Color Image

no code implementations CVPR 2018 Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.

Ranked #111 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

Sparse Representation for 3D Shape Estimation: A Convex Relaxation Approach

no code implementations 14 Sep 2015 Xiaowei Zhou, Menglong Zhu, Spyridon Leonardos, Kostas Daniilidis

We investigate the problem of estimating the 3D shape of an object defined by a set of 3D landmarks, given their 2D correspondences in a single image.

Ranked #118 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation

3D Shape Estimation from 2D Landmarks: A Convex Relaxation Approach

no code implementations CVPR 2015 Xiaowei Zhou, Spyridon Leonardos, Xiaoyan Hu, Kostas Daniilidis

We investigate the problem of estimating the 3D shape of an object, given a set of 2D landmarks in a single image.

Pose and Shape Estimation with Discriminatively Learned Parts

no code implementations 1 Feb 2015 Menglong Zhu, Xiaowei Zhou, Kostas Daniilidis

We introduce a new approach for estimating the 3D pose and the 3D shape of an object from a single image.

Low-Rank Modeling and Its Applications in Image Analysis

no code implementations 15 Jan 2014 Xiaowei Zhou, Can Yang, Hongyu Zhao, Weichuan Yu

In this paper, we review the recent advance of low-rank modeling, the state-of-the-art algorithms, and related applications in image analysis.

Collaborative Filtering Matrix Completion

Active Contours with Group Similarity

no code implementations CVPR 2013 Xiaowei Zhou, Xiaojie Huang, James S. Duncan, Weichuan Yu

In this paper, we propose to use the group similarity of object shapes in multiple images as a prior to aid segmentation, which can be interpreted as an unsupervised approach of shape prior modeling.

Image Segmentation Semantic Segmentation

Single Image Pop-Up From Discriminatively Learned Parts

no code implementations ICCV 2015 Menglong Zhu, Xiaowei Zhou, Kostas Daniilidis

We introduce a new approach for estimating a fine grained 3D shape and continuous pose of an object from a single image.

Latent Adversarial Defence with Boundary-guided Generation

no code implementations 16 Jul 2019 Xiaowei Zhou, Ivor W. Tsang, Jie Yin

The proposed LAD method improves the robustness of a DNN model through adversarial training on generated adversarial examples.

Learning Hybrid Representations for Automatic 3D Vessel Centerline Extraction

no code implementations 14 Dec 2020 Jiafa He, Chengwei Pan, Can Yang, Ming Zhang, Yang Wang, Xiaowei Zhou, Yizhou Yu

The main idea is to use CNNs to learn local appearances of vessels in image crops while using another point-cloud network to learn the global geometry of vessels in the entire image.

Representation Learning

Human-Understandable Decision Making for Visual Recognition

no code implementations 5 Mar 2021 Xiaowei Zhou, Jie Yin, Ivor Tsang, Chen Wang

The widespread use of deep neural networks has achieved substantial success in many tasks.

Decision Making

Generative Transition Mechanism to Image-to-Image Translation via Encoded Transformation

no code implementations 9 Mar 2021 Yaxin Shi, Xiaowei Zhou, Ping Liu, Ivor Tsang

To benefit the generalization ability of the translation model, we propose transition encoding to facilitate explicit regularization of these two kinds of consistencies on unseen transitions.

Attribute Image Reconstruction +2

Edge but not Least: Cross-View Graph Pooling

no code implementations 24 Sep 2021 Xiaowei Zhou, Jie Yin, Ivor W. Tsang

Graph neural networks have emerged as a powerful model for graph representation learning to undertake graph-level prediction tasks.

Graph Classification Graph Regression +1

You Don't Only Look Once: Constructing Spatial-Temporal Memory for Integrated 3D Object Detection and Tracking

no code implementations ICCV 2021 Jiaming Sun, Yiming Xie, Siyu Zhang, Linghao Chen, Guofeng Zhang, Hujun Bao, Xiaowei Zhou

In this work, we propose a novel system for integrated 3D object detection and tracking, which uses a dynamic object occupancy map and previous object states as spatial-temporal memory to assist object detection in future frames.

3D Object Detection Object +2

Efficient Neural Radiance Fields for Interactive Free-viewpoint Video

no code implementations 2 Dec 2021 Haotong Lin, Sida Peng, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao, Xiaowei Zhou

We propose a novel scene representation, called ENeRF, for the fast creation of interactive free-viewpoint videos.

Depth Estimation Depth Prediction +1

Reconstructing Hand-Held Objects from Monocular Video

no code implementations 30 Nov 2022 Di Huang, Xiaopeng Ji, Xingyi He, Jiaming Sun, Tong He, Qing Shuai, Wanli Ouyang, Xiaowei Zhou

The key idea is that the hand motion naturally provides multiple views of the object and the motion can be reliably estimated by a hand pose tracker.

Hand Pose Estimation Object

Ponder: Point Cloud Pre-training via Neural Rendering

no code implementations ICCV 2023 Di Huang, Sida Peng, Tong He, Honghui Yang, Xiaowei Zhou, Wanli Ouyang

We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering.

3D Reconstruction Image Generation +2

Painting 3D Nature in 2D: View Synthesis of Natural Scenes from a Single Semantic Mask

no code implementations CVPR 2023 Shangzhan Zhang, Sida Peng, Tianrun Chen, Linzhan Mou, Haotong Lin, Kaicheng Yu, Yiyi Liao, Xiaowei Zhou

We introduce a novel approach that takes a single semantic mask as input to synthesize multi-view consistent color images of natural scenes, trained with a collection of single images from the Internet.

3D-Aware Image Synthesis

Representing Volumetric Videos as Dynamic MLP Maps

no code implementations CVPR 2023 Sida Peng, Yunzhi Yan, Qing Shuai, Hujun Bao, Xiaowei Zhou

This paper introduces a novel representation of volumetric videos for real-time view synthesis of dynamic scenes.

Long-term Visual Localization with Mobile Sensors

no code implementations CVPR 2023 Shen Yan, Yu Liu, Long Wang, Zehong Shen, Zhen Peng, Haomin Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Despite the remarkable advances in image matching and pose estimation, image-based localization of a camera in a temporally-varying outdoor environment is still a challenging problem due to huge appearance disparity between query and reference images caused by illumination, seasonal and structural changes.

Image-Based Localization Pose Estimation +1

UTSGAN: Unseen Transition Suss GAN for Transition-Aware Image-to-image Translation

no code implementations 24 Apr 2023 Yaxin Shi, Xiaowei Zhou, Ping Liu, Ivor W. Tsang

Furthermore, we propose the use of transition consistency, defined on the transition variable, to enable regularization of consistency on unobserved translations, which is omitted in previous works.

Attribute Image-to-Image Translation +1

EasyHeC: Accurate and Automatic Hand-eye Calibration via Differentiable Rendering and Space Exploration

no code implementations 2 May 2023 Linghao Chen, Yuzhe Qin, Xiaowei Zhou, Hao Su

Hand-eye calibration is a critical task in robotics, as it directly affects the efficacy of critical operations such as manipulation and grasping.

Learning Human Mesh Recovery in 3D Scenes

no code implementations CVPR 2023 Zehong Shen, Zhi Cen, Sida Peng, Qing Shuai, Hujun Bao, Xiaowei Zhou

We present a novel method for recovering the absolute pose and shape of a human in a pre-scanned scene given a single image.

Human Mesh Recovery

Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields

no code implementations 24 Jul 2023 Shangzhan Zhang, Sida Peng, Yinji ShenTu, Qing Shuai, Tianrun Chen, Kaicheng Yu, Hujun Bao, Xiaowei Zhou

We extensively evaluate our approach on various scenes and show that our approach achieves spatially and temporally consistent editing results.

Relightable and Animatable Neural Avatar from Sparse-View Video

no code implementations 15 Aug 2023 Zhen Xu, Sida Peng, Chen Geng, Linzhan Mou, Zihan Yan, Jiaming Sun, Hujun Bao, Xiaowei Zhou

Based on the HDQ algorithm, we leverage sphere tracing to efficiently estimate the surface intersection and light visibility.

Inverse Rendering

Deep Active Contours for Real-time 6-DoF Object Tracking

no code implementations ICCV 2023 Long Wang, Shen Yan, Jianan Zhen, Yu Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Specifically, given an initial pose, we project the object model to the image plane to obtain the initial contour and use a lightweight network to predict how the contour should move to match the true object boundary, which provides the gradients to optimize the object pose.

Computational Efficiency Object +1

Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models

no code implementations ICCV 2023 Huaijin Pi, Sida Peng, Minghui Yang, Xiaowei Zhou, Hujun Bao

This paper presents a novel approach to generating the 3D motion of a human interacting with a target object, with a focus on solving the challenge of synthesizing long-range and diverse motions, which could not be fulfilled by existing auto-regressive models or path planning-based methods.

Human-Object Interaction Detection

4K4D: Real-Time 4D View Synthesis at 4K Resolution

no code implementations 17 Oct 2023 Zhen Xu, Sida Peng, Haotong Lin, Guangzhao He, Jiaming Sun, Yujun Shen, Hujun Bao, Xiaowei Zhou

Experiments show that our representation can be rendered at over 400 FPS on the DNA-Rendering dataset at 1080p resolution and 80 FPS on the ENeRF-Outdoor dataset at 4K resolution using an RTX 4090 GPU, which is 30x faster than previous methods and achieves the state-of-the-art rendering quality.

4k

SAM-guided Graph Cut for 3D Instance Segmentation

no code implementations 13 Dec 2023 Haoyu Guo, He Zhu, Sida Peng, Yuang Wang, Yujun Shen, Ruizhen Hu, Xiaowei Zhou

Experimental results on the ScanNet, ScanNet++ and KITTI-360 datasets demonstrate that our method achieves robust segmentation performance and can generalize across different types of scenes.

3D Instance Segmentation Segmentation +1

Street Gaussians for Modeling Dynamic Urban Scenes

no code implementations 2 Jan 2024 Yunzhi Yan, Haotong Lin, Chenxu Zhou, Weijie Wang, Haiyang Sun, Kun Zhan, Xianpeng Lang, Xiaowei Zhou, Sida Peng

We introduce Street Gaussians, a new explicit scene representation that tackles all these limitations.

AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model

no code implementations 27 Jan 2024 Beijia Chen, Yuefan Shen, Qing Shuai, Xiaowei Zhou, Kun Zhou, Youyi Zheng

In this paper, we introduce AniDress, a novel method for generating animatable human avatars in loose clothes using very sparse multi-view videos (4-8 in our setting).

SpatialTracker: Tracking Any 2D Pixels in 3D Space

no code implementations 5 Apr 2024 Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou

Recovering dense and long-range pixel motion in videos is a challenging problem.

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

no code implementations 17 Apr 2024 Xi Chen, Sida Peng, Dongchen Yang, YuAn Liu, Bowen Pan, Chengfei Lv, Xiaowei Zhou

This paper aims to recover object materials from posed images captured under an unknown static lighting condition.
