Search Results for author: Zhiguo Cao

Found 75 papers, 44 papers with code

Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction

no code implementations ECCV 2020 Xin Xiong, Haipeng Xiong, Ke Xian, Chen Zhao, Zhiguo Cao, Xin Li

Depth completion is a widely studied problem of predicting a dense depth map from a sparse set of measurements and a single RGB image.

Depth Completion, graph construction

3D Multi-frame Fusion for Video Stabilization

no code implementations 19 Apr 2024 Zhan Peng, Xinyi Ye, Weiyue Zhao, Tianqi Liu, Huiqiang Sun, Baopu Li, Zhiguo Cao

In this paper, we present RStab, a novel framework for video stabilization that integrates 3D multi-frame fusion through volume rendering.

In-Context Matting

1 code implementation 23 Mar 2024 He Guo, Zixuan Ye, Zhiguo Cao, Hao Lu

We introduce in-context matting, a novel task setting of image matting.

Image Matting

CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner

no code implementations 15 Mar 2024 Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou

Most existing one-shot skeleton-based action recognition focuses on raw low-level information (e.g., joint location), and may suffer from local information loss and low generalization ability.

Skeleton Based Action Recognition

DyBluRF: Dynamic Neural Radiance Fields from Blurry Monocular Video

no code implementations 15 Mar 2024 Huiqiang Sun, Xingyi Li, Liao Shen, Xinyi Ye, Ke Xian, Zhiguo Cao

Experimental results on our dataset demonstrate that our method outperforms existing approaches in generating sharp novel views from motion-blurred inputs while maintaining spatial-temporal consistency of the scene.

S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes

no code implementations 10 Mar 2024 Xingyi Li, Zhiguo Cao, Yizheng Wu, Kewei Wang, Ke Xian, Zhe Wang, Guosheng Lin

To address this limitation, we present S-DyRF, a reference-based spatio-temporal stylization method for dynamic neural radiance fields.

Style Transfer

End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context

1 code implementation 27 Oct 2023 Yiran Guan, Zhuoguang Chen, Wenzheng Zeng, Zhiguo Cao, Yang Xiao

In this letter, we propose a new method, Multi-Clue Gaze (MCGaze), to facilitate video gaze estimation via capturing spatial-temporal interaction context among head, face, and eye in an end-to-end learning way, which has not been well concerned yet.

Gaze Estimation

Point-Query Quadtree for Crowd Counting, Localization, and More

1 code implementation ICCV 2023 Chengxin Liu, Hao Lu, Zhiguo Cao, Tongliang Liu

Such a querying process yields an intuitive, universal modeling of crowd as both the input and output are interpretable and steerable.

Crowd Counting

Make-It-4D: Synthesizing a Consistent Long-Term Dynamic Scene Video from a Single Image

no code implementations 20 Aug 2023 Liao Shen, Xingyi Li, Huiqiang Sun, Juewen Peng, Ke Xian, Zhiguo Cao, Guosheng Lin

To animate the visual content, the feature point cloud is displaced based on the scene flow derived from motion estimation and the corresponding camera pose.

Motion Estimation

Diffusion-Augmented Depth Prediction with Sparse Annotations

no code implementations 4 Aug 2023 Jiaqi Li, Yiran Wang, Zihao Huang, Jinghong Zheng, Ke Xian, Zhiguo Cao, Jianming Zhang

We leverage the structural characteristics of diffusion model to enforce depth structures of depth models in a plug-and-play manner.

Autonomous Driving, Depth Estimation, +3

Box-DETR: Understanding and Boxing Conditional Spatial Queries

1 code implementation 17 Jul 2023 Wenze Liu, Hao Lu, Yuliang Liu, Zhiguo Cao

In DAB-DETR, such queries are modulated by the so-called conditional linear projection at each decoder stage, aiming to search for positions of interest such as the four extremities of the box.

Defocus to focus: Photo-realistic bokeh rendering by fusing defocus and radiance priors

no code implementations 7 Jun 2023 Xianrui Luo, Juewen Peng, Ke Xian, Zijin Wu, Zhiguo Cao

To this end, we present a Defocus to Focus (D2F) framework to learn realistic bokeh rendering by fusing defocus priors with the all-in-focus image and by implementing radiance priors in layered fusion.

Hallucination

Learning Probabilistic Coordinate Fields for Robust Correspondences

no code implementations 7 Jun 2023 Weiyue Zhao, Hao Lu, Xinyi Ye, Zhiguo Cao, Xin Li

We introduce Probabilistic Coordinate Fields (PCFs), a novel geometric-invariant coordinate representation for image correspondence problems.

Image Registration, Pose Estimation

A2B: Anchor to Barycentric Coordinate for Robust Correspondence

no code implementations 5 Jun 2023 Weiyue Zhao, Hao Lu, Zhiguo Cao, Xin Li

This approach offers a new perspective to alleviate the problem of repeated patterns and emphasizes the importance of choosing coordinate representations for feature correspondences.

Vision Transformer Off-the-Shelf: A Surprising Baseline for Few-Shot Class-Agnostic Counting

no code implementations 8 May 2023 Zhicheng Wang, Liwen Xiao, Zhiguo Cao, Hao Lu

This task is typically addressed by extracting the features of query image and exemplars respectively and then matching their feature similarity, leading to an extract-then-match paradigm.

Point-and-Shoot All-in-Focus Photo Synthesis from Smartphone Camera Pair

no code implementations 11 Apr 2023 Xianrui Luo, Juewen Peng, Weiyue Zhao, Ke Xian, Hao Lu, Zhiguo Cao

Benefiting from the multi-camera module in modern smartphones, we introduce a new task of AIF synthesis from main (wide) and ultra-wide cameras.

A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image

1 code implementation CVPR 2023 Changlong Jiang, Yang Xiao, Cunlin Wu, Mingyang Zhang, Jinghong Zheng, Zhiguo Cao, Joey Tianyi Zhou

3D interacting hand pose estimation from a single RGB image is a challenging task, due to serious self-occlusion and inter-occlusion towards hands, confusing similar appearance patterns between 2 hands, ill-posed joint position mapping from 2D to 3D, etc. To address these, we propose to extend A2J, the state-of-the-art depth-based 3D single hand pose estimation method, to RGB domain under interacting hand condition.

3D Interacting Hand Pose Estimation, Hand Pose Estimation, +1

Learning Second-Order Attentive Context for Efficient Correspondence Pruning

no code implementations 28 Mar 2023 Xinyi Ye, Weiyue Zhao, Hao Lu, Zhiguo Cao

It is challenging because of the disorganized spatial distribution of numerous outliers, especially when putative correspondences are largely dominated by outliers.

Find Beauty in the Rare: Contrastive Composition Feature Clustering for Nontrivial Cropping Box Regression

no code implementations 17 Feb 2023 Zhiyu Pan, Yinpeng Chen, Jiale Zhang, Hao Lu, Zhiguo Cao, Weicai Zhong

Observing that similar composition patterns tend to be shared by the cropping boundaries annotated nearly, we argue to find the beauty of composition from the rare samples by clustering the samples with similar cropping boundary annotations, i.e., similar composition patterns.

Clustering, Image Cropping, +2

Matching Is Not Enough: A Two-Stage Framework for Category-Agnostic Pose Estimation

1 code implementation CVPR 2023 Min Shi, Zihao Huang, Xianzheng Ma, Xiaowei Hu, Zhiguo Cao

To calibrate the inaccurate matching results, we introduce a two-stage framework, where matched keypoints from the first stage are viewed as similarity-aware position proposals.

Category-Agnostic Pose Estimation, Pose Estimation

Infusing Definiteness into Randomness: Rethinking Composition Styles for Deep Image Matting

1 code implementation 27 Dec 2022 Zixuan Ye, Yutong Dai, Chaoyi Hong, Zhiguo Cao, Hao Lu

Inspired by this, we introduce a novel composition style that binds the source and combined foregrounds in a definite triplet.

Image Matting

SAPA: Similarity-Aware Point Affiliation for Feature Upsampling

2 code implementations 26 Sep 2022 Hao Lu, Wenze Liu, Zixuan Ye, Hongtao Fu, Yuliang Liu, Zhiguo Cao

We introduce point affiliation into feature upsampling, a notion that describes the affiliation of each upsampled point to a semantic cluster formed by local decoder feature points with semantic similarity.

Depth Estimation, Feature Upsampling, +6

DoF-NeRF: Depth-of-Field Meets Neural Radiance Fields

1 code implementation 1 Aug 2022 Zijin Wu, Xingyi Li, Juewen Peng, Hao Lu, Zhiguo Cao, Weicai Zhong

To mitigate this issue, we introduce DoF-NeRF, a novel neural rendering approach that can deal with shallow DoF inputs and can simulate DoF effect.

Neural Rendering

Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels

1 code implementation 31 Jul 2022 Yinpeng Chen, Zhiyu Pan, Min Shi, Hao Lu, Zhiguo Cao, Weicai Zhong

Generative adversarial networks (GANs) have been trained to be professional artists able to create stunning artworks such as face generation and image style transfer.

Disentanglement, Face Generation, +1

FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling

no code implementations 21 Jul 2022 Hao Lu, Wenze Liu, Hongtao Fu, Zhiguo Cao

We consider the problem of task-agnostic feature upsampling in dense prediction where an upsampling operator is required to facilitate both region-sensitive tasks like semantic segmentation and detail-sensitive tasks such as image matting.

Feature Upsampling, Image Matting, +1

Robust Object Detection With Inaccurate Bounding Boxes

1 code implementation 20 Jul 2022 Chengxin Liu, Kewei Wang, Hao Lu, Zhiguo Cao, Ziming Zhang

As the crowd-sourcing labeling process and the ambiguities of the objects may raise noisy bounding box annotations, the object detectors will suffer from the degenerated training data.

Multiple Instance Learning, Object, +2

MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects

1 code implementation 18 Jul 2022 Juewen Peng, Jianming Zhang, Xianrui Luo, Hao Lu, Ke Xian, Zhiguo Cao

Partial occlusion effects are a phenomenon that blurry objects near a camera are semi-transparent, resulting in partial appearance of occluded background.

3D Instances as 1D Kernels

1 code implementation 15 Jul 2022 Yizheng Wu, Min Shi, Shuaiyuan Du, Hao Lu, Zhiguo Cao, Weicai Zhong

The idea of instance kernel is inspired by recent success of dynamic convolutions in 2D/3D instance segmentation.

Ranked #2 on 3D Instance Segmentation on S3DIS (mCov metric)

3D Instance Segmentation, Semantic Segmentation

BokehMe: When Neural Rendering Meets Classical Rendering

1 code implementation CVPR 2022 Juewen Peng, Zhiguo Cao, Xianrui Luo, Hao Lu, Ke Xian, Jianming Zhang

Based on this formulation, we implement the classical renderer by a scattering-based method and propose a two-stage neural renderer to fix the erroneous areas from the classical renderer.

Neural Rendering

Interior Attention-Aware Network for Infrared Small Target Detection

1 code implementation IEEE Transactions on Geoscience and Remote Sensing 2022 Kewei Wang, Shuaiyuan Du, Chengxin Liu, Zhiguo Cao

Motivated by the fact that pixels from targets or backgrounds are correlated to each other, we propose a coarse-to-fine interior attention-aware network (IAANet) for infrared small target detection.

2D Object Detection, 2D Semantic Segmentation

Composing Photos Like a Photographer

1 code implementation CVPR 2021 Chaoyi Hong, Shuaiyuan Du, Ke Xian, Hao Lu, Zhiguo Cao, Weicai Zhong

To this end, we introduce the concept of the key composition map (KCM) to encode the composition rules.

Image Cropping

On Efficient and Robust Metrics for RANSAC Hypotheses and 3D Rigid Registration

no code implementations 10 Nov 2020 Jiaqi Yang, Zhiqiang Huang, Siwen Quan, Qian Zhang, Yanning Zhang, Zhiguo Cao

This paper focuses on developing efficient and robust evaluation metrics for RANSAC hypotheses to achieve accurate 3D rigid registration.

ECML: An Ensemble Cascade Metric Learning Mechanism towards Face Verification

1 code implementation 11 Jul 2020 Fu Xiong, Yang Xiao, Zhiguo Cao, Yancheng Wang, Joey Tianyi Zhou, Jianxi Wu

Embedding RMML into the proposed ECML mechanism, our metric learning paradigm (EC-RMML) can run in the one-pass learning manner.

Face Verification, Fine-Grained Visual Recognition, +1

LRF-Net: Learning Local Reference Frames for 3D Local Shape Description and Matching

no code implementations 22 Jan 2020 Angfan Zhu, Jiaqi Yang, Weiyue Zhao, Zhiguo Cao

The local reference frame (LRF) plays a critical role in 3D local shape description and matching.

Pose Estimation

From Open Set to Closed Set: Supervised Spatial Divide-and-Conquer for Object Counting

3 code implementations 7 Jan 2020 Haipeng Xiong, Hao Lu, Chengxin Liu, Liang Liu, Chunhua Shen, Zhiguo Cao

Visual counting, a task that aims to estimate the number of objects from an image/video, is an open-set problem by nature, i.e., the number of population can vary in [0, inf) in theory.

Object Counting

Rotation Invariant Point Cloud Classification: Where Local Geometry Meets Global Topology

1 code implementation 1 Nov 2019 Chen Zhao, Jiaqi Yang, Xin Xiong, Angfan Zhu, Zhiguo Cao, Xin Li

To the best of our knowledge, this work is the first principled approach toward adaptively combining global and local information under the context of RI point cloud analysis.

General Classification, Point Cloud Classification

Iterative Clustering with Game-Theoretic Matching for Robust Multi-consistency Correspondence

no code implementations 3 Sep 2019 Chen Zhao, Jiaqi Yang, Ke Xian, Zhiguo Cao, Xin Li

Matching corresponding features between two images is a fundamental task to computer vision with numerous applications in object recognition, robotics, and 3D reconstruction.

3D Reconstruction, Clustering, +2

A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

2 code implementations ICCV 2019 Fu Xiong, Boshen Zhang, Yang Xiao, Zhiguo Cao, Taidong Yu, Joey Tianyi Zhou, Junsong Yuan

For 3D hand and body pose estimation task in depth image, a novel anchor-based approach termed Anchor-to-Joint regression network (A2J) with the end-to-end learning ability is proposed.

3D Pose Estimation, Depth Estimation, +1

Comparative evaluation of 2D feature correspondence selection algorithms

1 code implementation 30 Apr 2019 Chen Zhao, Jiaqi Yang, Yang Xiao, Zhiguo Cao

Correspondence selection aiming at seeking correct feature correspondences from raw feature matches is pivotal for a number of feature-matching-based tasks.

Learning to Fuse Local Geometric Features for 3D Rigid Data Matching

no code implementations 27 Apr 2019 Jiaqi Yang, Chen Zhao, Ke Xian, Angfan Zhu, Zhiguo Cao

This paper presents a simple yet very effective data-driven approach to fuse both low-level and high-level local geometric features for 3D rigid data matching.

NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences

1 code implementation CVPR 2019 Chen Zhao, Zhiguo Cao, Chi Li, Xin Li, Jiaqi Yang

Feature correspondence selection is pivotal to many feature-matching based tasks in computer vision.

Towards Real-time Eyeblink Detection in The Wild: Dataset, Theory and Practices

no code implementations 21 Feb 2019 Guilei Hu, Yang Xiao, Zhiguo Cao, Lubin Meng, Zhiwen Fang, Joey Tianyi Zhou, Junsong Yuan

Effective and real-time eyeblink detection is of wide-range applications, such as deception detection, drive fatigue detection, face anti-spoofing, etc.

Attribute, Deception Detection, +1

Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification

1 code implementation 29 Jul 2018 Fu Xiong, Yang Xiao, Zhiguo Cao, Kaicheng Gong, Zhiwen Fang, Joey Tianyi Zhou

Person re-identification is indeed a challenging visual recognition task due to the critical issues of human pose variation, human body occlusion, camera view variation, etc.

Open-Ended Question Answering, Person Re-Identification

Deep attention-based classification network for robust depth prediction

1 code implementation 11 Jul 2018 Ruibo Li, Ke Xian, Chunhua Shen, Zhiguo Cao, Hao Lu, Lingxiao Hang

However, robust depth prediction suffers from two challenging problems: a) How to extract more discriminative features for different scenes (compared to a single scene)?

Classification, Deep Attention, +5

Monocular Depth Estimation with Augmented Ordinal Depth Relationships

no code implementations 2 Jun 2018 Yuanzhouhan Cao, Tianqi Zhao, Ke Xian, Chunhua Shen, Zhiguo Cao, Shugong Xu

In this paper, we propose to improve the performance of metric depth estimation with relative depths collected from stereo movie videos using existing stereo matching algorithm.

Depth Prediction, Monocular Depth Estimation, +2

Performance Evaluation of 3D Correspondence Grouping Algorithms

no code implementations 6 Apr 2018 Jiaqi Yang, Ke Xian, Yang Xiao, Zhiguo Cao

This paper presents a thorough evaluation of several widely-used 3D correspondence grouping algorithms, motivated by their significance in vision tasks relying on correct feature correspondences.

3D Object Recognition, Point Cloud Registration, +1

When Unsupervised Domain Adaptation Meets Tensor Representations

1 code implementation ICCV 2017 Hao Lu, Lei Zhang, Zhiguo Cao, Wei Wei, Ke Xian, Chunhua Shen, Anton Van Den Hengel

Domain adaption (DA) allows machine learning methods trained on data sampled from one distribution to be applied to data sampled from another.

Unsupervised Domain Adaptation

TasselNet: Counting maize tassels in the wild via local counts regression network

no code implementations 7 Jul 2017 Hao Lu, Zhiguo Cao, Yang Xiao, Bohan Zhuang, Chunhua Shen

To our knowledge, this is the first time that a plant-related counting problem is considered using computer vision technologies under unconstrained field-based environment.

Plant Phenotyping, regression
