CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

1 code implementation 2 Oct 2023 Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy

However, when transferring the vision-language alignment of CLIP from global image representation to local region representation for the open-vocabulary dense prediction tasks, CLIP ViTs suffer from the domain shift from full images to local image regions.

Automatic Animation of Hair Blowing in Still Portrait Photos

no code implementations ICCV 2023 Wenpeng Xiao, Wentao Liu, Yitong Wang, Bernard Ghanem, Bing Li

Considering the complexity of hair structure, we innovatively treat hair wisp extraction as an instance segmentation problem, where a hair wisp is referred to as an instance.

Two-Stage Hybrid Supervision Framework for Fast, Low-resource, and Accurate Organ and Pan-cancer Segmentation in Abdomen CT

no code implementations 11 Sep 2023 Wentao Liu, Tong Tian, Weijin Xu, Lemeng Wang, Haoyuan Li, Huihua Yang

Abdominal organ and tumour segmentation has many important clinical applications, such as organ quantification, surgical planning, and disease diagnosis.


GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

no code implementations 28 Aug 2023 Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu

Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image while modeling the complex relationships between labels and image regions.

DIAS: A Comprehensive Benchmark for DSA-sequence Intracranial Artery Segmentation

1 code implementation 21 Jun 2023 Wentao Liu, Tong Tian, Lemeng Wang, Weijin Xu, Haoyuan Li, Wenyi Zhao, Xipeng Pan, Huihua Yang, Feng Gao, Yiming Deng, Ruisheng Su

Automatic segmentation of the intracranial artery (IA) in digital subtraction angiography (DSA) sequences is an essential step in diagnosing IA-related diseases and guiding neuro-interventional surgery.

Aligning Bag of Regions for Open-Vocabulary Object Detection

1 code implementation CVPR 2023 Size Wu, Wenwei Zhang, Sheng Jin, Wentao Liu, Chen Change Loy

The embeddings of regions in a bag are treated as embeddings of words in a sentence, and they are sent to the text encoder of a VLM to obtain the bag-of-regions embedding, which is learned to be aligned to the corresponding features extracted by a frozen VLM.

ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild

1 code implementation 23 Aug 2022 Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

We propose a single-network approach, termed ZoomNet, to take into account the hierarchical structure of the full human body and solve the scale variation of different body parts.

Transforming the Interactive Segmentation for Medical Imaging

no code implementations 20 Aug 2022 Wentao Liu, Chaofan Ma, Yuhuan Yang, Weidi Xie, Ya Zhang

The goal of this paper is to interactively refine the automatic segmentation on challenging structures that fall behind human performance, either due to the scarcity of available annotations or the difficult nature of the problem itself, for example, when segmenting cancer or small organs.

Combining Self-Training and Hybrid Architecture for Semi-supervised Abdominal Organ Segmentation

2 code implementations 23 Jul 2022 Wentao Liu, Weijin Xu, Songlin Yan, Lemeng Wang, Haoyuan Li, Huihua Yang

Abdominal organ segmentation has many important clinical applications, such as organ quantification, surgical planning, and disease diagnosis.

3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal

1 code implementation 22 Jul 2022 Hao Meng, Sheng Jin, Wentao Liu, Chen Qian, Mengxiang Lin, Wanli Ouyang, Ping Luo

Unlike most previous works that directly predict the 3D poses of two interacting hands simultaneously, we propose to decompose the challenging interacting hand pose estimation task and estimate the pose of each hand separately.

Pose for Everything: Towards Category-Agnostic Pose Estimation

1 code implementation 21 Jul 2022 Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

In this paper, we introduce the task of Category-Agnostic Pose Estimation (CAPE), which aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.

PHTrans: Parallelly Aggregating Global and Local Representations for Medical Image Segmentation

2 code implementations 9 Mar 2022 Wentao Liu, Tong Tian, Weijin Xu, Huihua Yang, Xipeng Pan, Songlin Yan, Lemeng Wang

In this paper, we propose a novel hybrid architecture for medical image segmentation called PHTrans, which parallelly hybridizes Transformer and CNN in main building blocks to produce hierarchical representations from global and local features and adaptively aggregate them, aiming to fully exploit their strengths to obtain better segmentation performance.

Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization

no code implementations ICLR 2022 Can Wang, Sheng Jin, Yingda Guan, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang

PL approaches apply pseudo-labels to unlabeled data, and then train the model with a combination of the labeled and pseudo-labeled data iteratively.

Perceptual Quality Assessment of Colored 3D Point Clouds

1 code implementation 10 Nov 2021 Honglei Su, Qi Liu, Zhengfang Duanmu, Wentao Liu, Zhou Wang

In this work, we first build a large 3D point cloud database for subjective and objective quality assessment of point clouds.

Luminance Attentive Networks for HDR Image and Panorama Reconstruction

1 code implementation 14 Sep 2021 Hanning Yu, Wentao Liu, Chengjiang Long, Bo Dong, Qin Zou, Chunxia Xiao

Based on this observation, we propose a novel normalization method called "HDR calibration" for HDR images stored in relative luminance, calibrating HDR images into a similar luminance scale according to the LDR images.

Joint Depth and Normal Estimation from Real-world Time-of-flight Raw Data

no code implementations 8 Aug 2021 Rongrong Gao, Na Fan, Changlin Li, Wentao Liu, Qifeng Chen

We present a novel approach to joint depth and normal estimation for time-of-flight (ToF) sensors.

Human Pose Regression with Residual Log-likelihood Estimation

3 code implementations ICCV 2021 Jiefeng Li, Siyuan Bian, Ailing Zeng, Can Wang, Bo Pang, Wentao Liu, Cewu Lu

In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution.

3D Human Pose Estimation, Multi-Person Pose Estimation, +1

When Human Pose Estimation Meets Robustness: Adversarial Algorithms and Benchmarks

1 code implementation CVPR 2021 Jiahang Wang, Sheng Jin, Wentao Liu, Weizhong Liu, Chen Qian, Ping Luo

However, unlike human vision that is robust to various data corruptions such as blur and pixelation, current pose estimators are easily confused by these corruptions.

Quantifying Visual Image Quality: A Bayesian View

no code implementations 30 Jan 2021 Zhengfang Duanmu, Wentao Liu, Zhongling Wang, Zhou Wang

Image quality assessment (IQA) models aim to establish a quantitative relationship between visual images and their perceptual quality by human observers.

SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation

1 code implementation ECCV 2020 Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou

Recovering multi-person 3D poses with absolute scales from a single RGB image is a challenging problem due to the inherent depth and scale ambiguity from a single view.

HMOR: Hierarchical Multi-Person Ordinal Relations for Monocular Multi-Person 3D Pose Estimation

no code implementations ECCV 2020 Jiefeng Li, Can Wang, Wentao Liu, Chen Qian, Cewu Lu

The HMOR hierarchically encodes interaction information as ordinal relations of depths and angles, capturing body-part and joint-level semantics while maintaining global consistency.

Differentiable Hierarchical Graph Grouping for Multi-Person Pose Estimation

no code implementations ECCV 2020 Sheng Jin, Wentao Liu, Enze Xie, Wenhai Wang, Chen Qian, Wanli Ouyang, Ping Luo

The modules of HGG can be trained end-to-end with the keypoint detection network and are able to supervise the grouping process in a hierarchical manner.

Whole-Body Human Pose Estimation in the Wild

2 code implementations ECCV 2020 Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo

This paper investigates the task of 2D human whole-body pose estimation, which aims to localize dense landmarks on the entire human body including face, hands, body, and feet.

3D Human Mesh Regression with Dense Correspondence

3 code implementations CVPR 2020 Wang Zeng, Wanli Ouyang, Ping Luo, Wentao Liu, Xiaogang Wang

This paper proposes a model-free 3D human mesh estimation framework, named DecoMR, which explicitly establishes the dense correspondence between the mesh and the local image features in the UV space (i.e., a 2D space used for texture mapping of a 3D mesh).

3D Human Pose Estimation, 3D Human Reconstruction, +1

Omni-sourced Webly-supervised Learning for Video Recognition

3 code implementations ECCV 2020 Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin

Then a joint-training strategy is proposed to deal with the domain gaps between multiple data sources and formats in webly-supervised learning.

TRB: A Novel Triplet Representation for Understanding 2D Human Body

2 code implementations ICCV 2019 Haodong Duan, Kwan-Yee Lin, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang

In this paper, we propose the Triplet Representation for Body (TRB) -- a compact 2D human body representation, with skeleton keypoints capturing human pose information and contour keypoints containing human shape information.

EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment

no code implementations 26 May 2019 Mohamed Elgharib, Mallikarjun BR, Ayush Tewari, Hyeongwoo Kim, Wentao Liu, Hans-Peter Seidel, Christian Theobalt

Our lightweight setup allows operations in uncontrolled environments, and lends itself to telepresence applications such as video-conferencing from dynamic environments.

dipIQ: Blind Image Quality Assessment by Learning-to-Rank Discriminable Image Pairs

no code implementations 13 Apr 2019 Kede Ma, Wentao Liu, Tongliang Liu, Zhou Wang, Dacheng Tao

One of the biggest challenges in learning BIQA models is the conflict between the gigantic image space (which is in the dimension of the number of image pixels) and the extremely limited reliable ground truth data for training.

Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation

no code implementations CVPR 2019 Xipeng Chen, Kwan-Yee Lin, Wentao Liu, Chen Qian, Xiaogang Wang, Liang Lin

Recent studies have shown remarkable advances in 3D human pose estimation from monocular images, with the help of large-scale indoor 3D datasets and sophisticated network architectures.

Person Search in Videos with One Portrait Through Visual and Temporal Links

2 code implementations ECCV 2018 Qingqiu Huang, Wentao Liu, Dahua Lin

In real-world applications, e.g., law enforcement and video retrieval, one often needs to search for a certain person in long videos with just one portrait.

DRPose3D: Depth Ranking in 3D Human Pose Estimation

no code implementations 23 May 2018 Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, Lizhuang Ma

In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation.

3D Human Pose Estimation, 3D Pose Estimation

