Search Results for author: Wu Liu

Found 46 papers, 17 papers with code

Content-Consistent Matching for Domain Adaptive Semantic Segmentation

1 code implementation ECCV 2020 Guangrui Li, Guoliang Kang, Wu Liu, Yunchao Wei, Yi Yang

The target of CCM is to acquire those synthetic images that share similar distribution with the real ones in the target domain, so that the domain gap can be naturally alleviated by employing the content-consistent synthetic images for training.

Domain Adaptation Semantic Segmentation +1

Guided Saliency Feature Learning for Person Re-identification in Crowded Scenes

1 code implementation ECCV 2020 Lingxiao He, Wu Liu

More importantly, we propose a new matching approach, called Guided Adaptive Spatial Matching (GASM), which expects that each spatial feature in the query can find the most similar spatial features of a person in a gallery to match.

Person Re-Identification

Parsing is All You Need for Accurate Gait Recognition in the Wild

1 code implementation 31 Aug 2023 Jinkai Zheng, Xinchen Liu, Shuai Wang, Lihao Wang, Chenggang Yan, Wu Liu

Furthermore, due to the lack of suitable datasets, we build the first parsing-based dataset for gait recognition in the wild, named Gait3D-Parsing, by extending the large-scale and challenging Gait3D dataset.

Gait Recognition in the Wild Human Parsing

Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification

no code implementations 17 Jul 2023 Tengfei Liang, Yi Jin, Wu Liu, Tao Wang, Songhe Feng, Yidong Li

Visible-Infrared person Re-IDentification (VI-ReID) is a challenging cross-modality image retrieval task that aims to match pedestrians' images across visible and infrared cameras.

Image Classification Image Retrieval +3

TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments

3 code implementations CVPR 2023 Yu Sun, Qian Bao, Wu Liu, Tao Mei, Michael J. Black

Although the estimation of 3D human pose and shape (HPS) is rapidly progressing, current methods still cannot reliably estimate moving humans in global coordinates, which is critical for many applications.

3D Human Pose Estimation regression

Learning To Segment Every Referring Object Point by Point

no code implementations CVPR 2023 Mengxue Qu, Yu Wu, Yunchao Wei, Wu Liu, Xiaodan Liang, Yao Zhao

Extensive experiments show that our model achieves 52.06% in terms of accuracy (versus 58.93% in fully supervised setting) on RefCOCO+@testA, when only using 1% of the mask annotations.

Referring Expression Referring Expression Segmentation

WOC: A Handy Webcam-based 3D Online Chatroom

no code implementations 2 Sep 2022 Chuanhang Yan, Yu Sun, Qian Bao, Jinhui Pang, Wu Liu, Tao Mei

We develop WOC, a webcam-based 3D virtual online chatroom for multi-person interaction, which captures the 3D motion of users and drives their individual 3D virtual avatars in real-time.

Delving into the Frequency: Temporally Consistent Human Motion Transfer in the Fourier Space

no code implementations 1 Sep 2022 Guang Yang, Wu Liu, Xinchen Liu, Xiaoyan Gu, Juan Cao, Jintao Li

To close the frequency gap between the natural and synthetic videos, we propose a novel Frequency-based human MOtion TRansfer framework, named FreMOTR, which can effectively mitigate the spatial artifacts and the temporal inconsistency of the synthesized videos.

DeepFake Detection Face Swapping

Gait Recognition in the Wild with Multi-hop Temporal Switch

1 code implementation 1 Sep 2022 Jinkai Zheng, Xinchen Liu, Xiaoyan Gu, Yaoqi Sun, Chuang Gan, Jiyong Zhang, Wu Liu, Chenggang Yan

Current methods that obtain state-of-the-art performance on in-the-lab benchmarks achieve much worse accuracy on the recently proposed in-the-wild datasets because these methods can hardly model the varied temporal dynamics of gait sequences in unconstrained scenes.

Gait Recognition in the Wild

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition

no code implementations 1 Sep 2022 Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei

In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.

Action Recognition

REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer

no code implementations 1 Sep 2022 Quanwei Yang, Xinchen Liu, Wu Liu, Hongtao Xie, Xiaoyan Gu, Lingyun Yu, Yongdong Zhang

Human Video Motion Transfer (HVMT) aims to, given an image of a source person, generate his/her video that imitates the motion of the driving person.

SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding

1 code implementation 27 Jul 2022 Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei

Particularly, SiRi conveys a significant principle to the research of visual grounding, i.e., a better initialized vision-language encoder would help the model converge to a better local minimum, advancing the performance accordingly.

Visual Grounding

CAINNFlow: Convolutional block Attention modules and Invertible Neural Networks Flow for anomaly detection and localization tasks

no code implementations 4 Jun 2022 Ruiqing Yan, Fan Zhang, Mengyuan Huang, Wu Liu, Dongyu Hu, Jinfeng Li, Qiang Liu, Jinrong Jiang, Qianjin Guo, Linghan Zheng

Detection of object anomalies is crucial in industrial processes, but unsupervised anomaly detection and localization is particularly important due to the difficulty of obtaining a large number of defective samples and the unpredictable types of anomalies in real life.

Unsupervised Anomaly Detection

Structured Two-stream Attention Network for Video Question Answering

no code implementations 2 Jun 2022 Lianli Gao, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, Heng Tao Shen

To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in vision and language understanding, especially for video QA.

Question Answering Video Question Answering +2

Gait Recognition in the Wild with Dense 3D Representations and A Benchmark

1 code implementation CVPR 2022 Jinkai Zheng, Xinchen Liu, Wu Liu, Lingxiao He, Chenggang Yan, Tao Mei

Based on Gait3D, we comprehensively compare our method with existing gait recognition approaches, which reflects the superior performance of our framework and the potential of 3D representations for gait recognition in the wild.

Gait Recognition in the Wild

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

no code implementations 9 Mar 2022 Xiaodong Chen, Xinchen Liu, Wu Liu, Kun Liu, Dong Wu, Yongdong Zhang, Tao Mei

Therefore, researchers start to focus on a new task, Part-level Action Parsing (PAP), which aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.

Action Parsing Action Recognition

MSO: Multi-Feature Space Joint Optimization Network for RGB-Infrared Person Re-Identification

no code implementations 21 Oct 2021 Yajun Gao, Tengfei Liang, Yi Jin, Xiaoyan Gu, Wu Liu, Yidong Li, Congyan Lang

The RGB-infrared cross-modality person re-identification (ReID) task aims to recognize the images of the same identity between the visible modality and the infrared modality.

Cross-Modality Person Re-identification Person Re-Identification

CMTR: Cross-modality Transformer for Visible-infrared Person Re-identification

no code implementations 18 Oct 2021 Tengfei Liang, Yi Jin, Yajun Gao, Wu Liu, Songhe Feng, Tao Wang, Yidong Li

The existing convolutional neural network-based methods mainly face the problem of insufficient perception of modalities' information, and cannot learn good discriminative modality-invariant embeddings for identities, which limits their performance.

Cross-Modality Person Re-identification Person Re-Identification

A Baseline Framework for Part-level Action Parsing and Action Recognition

no code implementations 7 Oct 2021 Xiaodong Chen, Xinchen Liu, Kun Liu, Wu Liu, Tao Mei

This technical report introduces our 2nd place solution to Kinetics-TPS Track on Part-level Action Parsing in ICCV DeeperAction Workshop 2021.

Action Parsing Action Recognition +1

Semi-Supervised Domain Generalizable Person Re-Identification

3 code implementations 11 Aug 2021 Lingxiao He, Wu Liu, Jian Liang, Kecheng Zheng, Xingyu Liao, Peng Cheng, Tao Mei

Instead, we aim to explore multiple labeled datasets to learn generalized domain-invariant representations for person re-id, which are expected to be universally effective for each new-coming re-id scenario.

Ranked #16 on Person Re-Identification on Market-1501 (using extra training data)

Generalizable Person Re-identification Knowledge Distillation +1

FasterPose: A Faster Simple Baseline for Human Pose Estimation

no code implementations 7 Jul 2021 Hanbin Dai, Hailin Shi, Wu Liu, Linfang Wang, Yinglu Liu, Tao Mei

By the experimental analysis, we find that the HR representation leads to a sharp increase of computational cost, while the accuracy improvement remains marginal compared with the low-resolution (LR) representation.

Pose Estimation

Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

no code implementations 23 Apr 2021 Wu Liu, Qian Bao, Yu Sun, Tao Mei

We believe this survey will provide the readers with a deep and insightful understanding of monocular human pose estimation.

3D Human Pose Estimation

Group-aware Label Transfer for Domain Adaptive Person Re-identification

1 code implementation CVPR 2021 Kecheng Zheng, Wu Liu, Lingxiao He, Tao Mei, Jiebo Luo, Zheng-Jun Zha

In this paper, we propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.

Clustering Domain Adaptive Person Re-Identification +4

TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain Gait Recognition

1 code implementation 9 Feb 2021 Jinkai Zheng, Xinchen Liu, Chenggang Yan, Jiyong Zhang, Wu Liu, XiaoPing Zhang, Tao Mei

Despite significant improvement in gait recognition with deep learning, existing studies still neglect a more practical but challenging scenario -- unsupervised cross-domain gait recognition, which aims to learn a model on a labeled dataset and then adapt it to an unlabeled dataset.

Gait Recognition

Neural Architecture Search for Joint Human Parsing and Pose Estimation

1 code implementation ICCV 2021 Dan Zeng, Yuhang Huang, Qian Bao, Junjie Zhang, Chi Su, Wu Liu

With the spirit of NAS, we propose to search for an efficient network architecture (NPPNet) to tackle two tasks at the same time.

Human Parsing Neural Architecture Search +1

Synthetic Training for Monocular Human Mesh Recovery

no code implementations 27 Oct 2020 Yu Sun, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu, Chuang Gan, Tao Mei

To solve this problem, we design a multi-branch framework to disentangle the regression of different body properties, enabling us to separate each component's training in a synthetic training manner using unpaired data available.

Human Mesh Recovery

Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification

no code implementations 12 Oct 2020 Xiruo Shi, Liutong Xu, Pengfei Wang, Yuanyuan Gao, Haifang Jian, Wu Liu

Specifically, LAFE utilizes the region attention modules and channel attention modules to extract discriminative features and confusable features respectively.

Classification Fine-Grained Image Classification +1

Hierarchical Gumbel Attention Network for Text-based Person Search

no code implementations 10 Oct 2020 Kecheng Zheng, Wu Liu, Jiawei Liu, Zheng-Jun Zha, Tao Mei

This hard selection strategy is able to fuse the strong-relevant multi-modality features for alleviating the problem of matching redundancy.

Image Retrieval Image-to-Text Retrieval +5

Monocular, One-stage, Regression of Multiple 3D People

1 code implementation ICCV 2021 Yu Sun, Qian Bao, Wu Liu, Yili Fu, Michael J. Black, Tao Mei

Through a body-center-guided sampling process, the body mesh parameters of all people in the image are easily extracted from the Mesh Parameter map.

Ranked #1 on 3D Multi-Person Mesh Recovery on Relative Human (using extra training data)

3D Depth Estimation 3D Multi-Person Mesh Recovery +2

Black Re-ID: A Head-shoulder Descriptor for the Challenging Problem of Person Re-Identification

no code implementations 19 Aug 2020 Boqiang Xu, Lingxiao He, Xingyu Liao, Wu Liu, Zhenan Sun, Tao Mei

Given the input person image, the ensemble method would focus on the head-shoulder feature by assigning a larger weight if the individual inside the image is in black clothing.

Person Re-Identification

A Real-time Action Representation with Temporal Encoding and Deep Compression

no code implementations 17 Jun 2020 Kun Liu, Wu Liu, Huadong Ma, Mingkui Tan, Chuang Gan

Our method achieves clear improvements on UCF101 action recognition benchmark against state-of-the-art real-time methods by 5.4% in terms of accuracy and 2 times faster in terms of inference speed with a less than 5MB storage model.

Action Recognition

FastReID: A Pytorch Toolbox for General Instance Re-identification

2 code implementations 4 Jun 2020 Lingxiao He, Xingyu Liao, Wu Liu, Xinchen Liu, Peng Cheng, Tao Mei

General Instance Re-identification is a very important task in computer vision, which can be widely used in many practical applications, such as person/vehicle re-identification, face recognition, wildlife protection, commodity tracing, and snapshop, etc. To meet the increasing application demand for general instance re-identification, we present FastReID as a widely used software system in JD AI Research.

Face Recognition Image Retrieval +2

Deep Recurrent Quantization for Generating Sequential Binary Codes

1 code implementation 16 Jun 2019 Jingkuan Song, Xiaosu Zhu, Lianli Gao, Xin-Shun Xu, Wu Liu, Heng Tao Shen

To the end, when the model is trained, a sequence of binary codes can be generated and the code length can be easily controlled by adjusting the number of recurrent iterations.

Image Retrieval Quantization +1

PVSS: A Progressive Vehicle Search System for Video Surveillance Networks

no code implementations 10 Jan 2019 Xinchen Liu, Wu Liu, Huadong Ma, Shuangqun Li

In this paper, a Progressive Vehicle Search System, named as PVSS, is designed to solve the above problems.

Multi-Granularity Reasoning for Social Relation Recognition from Images

no code implementations 10 Jan 2019 Meng Zhang, Xinchen Liu, Wu Liu, Anfu Zhou, Huadong Ma, Tao Mei

To bridge the domain gap, we propose a Multi-Granularity Reasoning framework for social relation recognition from images.

KTAN: Knowledge Transfer Adversarial Network

no code implementations 18 Oct 2018 Peiye Liu, Wu Liu, Huadong Ma, Tao Mei, Mingoo Seok

To transfer the knowledge of intermediate representations, we set high-level teacher feature maps as a target, toward which the student feature maps are trained.

Image Classification Knowledge Distillation +3

An efficient deep learning hashing neural network for mobile visual search

no code implementations 21 Oct 2017 Heng Qi, Wu Liu, Liang Liu

Mobile visual search applications are emerging that enable users to sense their surroundings with smart phones.

Generalized Zero-Shot Learning for Action Recognition with Web-Scale Video Data

no code implementations 20 Oct 2017 Kun Liu, Wu Liu, Huadong Ma, Wenbing Huang, Xiongxiong Dong

Motivated by this, we study the task of action recognition in surveillance video under a more realistic generalized zero-shot setting, where testing data contains both seen and unseen classes.

Action Recognition Generalized Zero-Shot Learning +1

Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning

no code implementations 5 Jun 2017 Jingkuan Song, Zhao Guo, Lianli Gao, Wu Liu, Dongxiang Zhang, Heng Tao Shen

Specifically, the proposed framework utilizes the temporal attention for selecting specific frames to predict the related words, while the adjusted temporal attention is for deciding whether to depend on the visual information or the language context information.

Language Modelling Video Captioning

Multi-Task Deep Visual-Semantic Embedding for Video Thumbnail Selection

no code implementations CVPR 2015 Wu Liu, Tao Mei, Yongdong Zhang, Cherry Che, Jiebo Luo

Given the tremendous growth of online videos, the video thumbnail, as the common visualization form of video content, is becoming increasingly important in influencing users' browsing and searching experience.

Multi-Task Learning
