no code implementations • 22 Jun 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To ensure high discrimination, we propose a Feature Restoration (FR) operation to distill task-relevant features from the residual information and use them to compensate for the aligned features.
Ranked #82 on Domain Generalization on PACS
no code implementations • 8 Jun 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Shih-Fu Chang
There is a lack of loss designs that enable the joint optimization of multiple instances (of multiple classes) within per-query optimization for person ReID.
no code implementations • ECCV 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To address this problem, we introduce a global distance-distributions separation (GDS) constraint over the two distributions to encourage the clear separation of positive and negative samples from a global view.
1 code implementation • CVPR 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Li Zhang
Existing fully-supervised person re-identification (ReID) methods usually suffer from poor generalization capability caused by domain gaps.
Ranked #9 on Unsupervised Domain Adaptation on Market to Duke
2 code implementations • ECCV 2020 • Hanyue Tu, Chunyu Wang, Wen-Jun Zeng
In contrast to previous efforts, which require establishing cross-view correspondence based on noisy and incomplete 2D pose estimations, we present an end-to-end solution which operates directly in the 3D space and therefore avoids making incorrect decisions in the 2D space.
Ranked #6 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)
no code implementations • CVPR 2020 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng
Based on the probability space, we further generate new fusion strategies which achieve state-of-the-art performance on four well-known action recognition datasets.
33 code implementations • 4 Apr 2020 • Yifu Zhang, Chunyu Wang, Xinggang Wang, Wen-Jun Zeng, Wenyu Liu
Formulating MOT as multi-task learning of object detection and re-ID in a single network is appealing since it allows joint optimization of the two tasks and enjoys high computation efficiency.
Ranked #1 on Multi-Object Tracking on 2DMOT15 (using extra training data)
no code implementations • CVPR 2020 • Guangting Wang, Chong Luo, Xiaoyan Sun, Zhiwei Xiong, Wen-Jun Zeng
We propose a principled three-step approach to build a high-performance tracker.
no code implementations • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into a discriminative video-level feature representation.
1 code implementation • CVPR 2020 • Zhe Zhang, Chunyu Wang, Wenhu Qin, Wen-Jun Zeng
Then we lift the multi-view 2D poses to the 3D space by an Orientation Regularized Pictorial Structure Model (ORPSM) which jointly minimizes the projection error between the 3D and 2D poses, along with the discrepancy between the 3D pose and IMU orientations.
Ranked #1 on 3D Absolute Human Pose Estimation on Total Capture
no code implementations • 15 Jan 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To the best of our knowledge, we are the first to make use of multi-shots of an object in a teacher-student learning manner for effectively boosting the single image based re-id.
3 code implementations • Applications of Artificial Intelligence Conference 2019 • Dacheng Yin, Chong Luo, Zhiwei Xiong, Wen-Jun Zeng
We discover that the two streams should communicate with each other, and this is crucial to phase prediction.
Sound Audio and Speech Processing
no code implementations • ICCV 2019 • Junsheng Zhou, Yuwang Wang, Kaihuai Qin, Wen-Jun Zeng
Unsupervised depth learning takes the appearance difference between a target view and a view synthesized from its adjacent frame as supervisory signal.
Monocular Depth Estimation
Vocal Bursts Intensity Prediction
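The supervisory signal described in this entry can be illustrated with a minimal sketch. It assumes the appearance difference is a mean absolute (L1) difference over grayscale pixels; the function name `photometric_l1` and the nested-list image format are hypothetical, and the paper's actual loss may combine other terms.

```python
def photometric_l1(target, synthesized):
    """Mean absolute pixel difference between two equally sized grayscale images.

    Both images are lists of rows of floats. This L1 form is an assumption
    made for illustration, not necessarily the loss used in the paper.
    """
    total = 0.0
    count = 0
    for row_t, row_s in zip(target, synthesized):
        for pixel_t, pixel_s in zip(row_t, row_s):
            total += abs(pixel_t - pixel_s)
            count += 1
    return total / count


# A perfectly synthesized view gives zero loss; mismatched pixels raise it.
target_view = [[0.0, 1.0], [1.0, 0.0]]
synthesized_view = [[1.0, 1.0], [1.0, 1.0]]
loss = photometric_l1(target_view, synthesized_view)
```

In a training setup, the synthesized view would come from warping an adjacent frame with the predicted depth and camera motion, so lowering this loss pushes the depth prediction toward consistency with the observed frames.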
no code implementations • ICCV 2019 • Junsheng Zhou, Yuwang Wang, Kaihuai Qin, Wen-Jun Zeng
Our experimental evaluation demonstrates that the result of our method is comparable to fully supervised methods on the NYU Depth V2 benchmark.
no code implementations • 26 Sep 2019 • Huapeng Wu, Zhengxia Zou, Jie Gui, Wen-Jun Zeng, Jieping Ye, Jun Zhang, Hongyi Liu, Zhihui Wei
In this paper, we make a thorough investigation of the attention mechanisms in an SR model and shed light on how simple and effective improvements on these ideas improve the state of the art.
no code implementations • 3 Sep 2019 • Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng
For an RNN block, an EleAttG is used for adaptively modulating the input by assigning different levels of importance, i.e., attention, to each element/dimension of the input.
Ranked #3 on Skeleton Based Action Recognition on SYSU 3D
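The element-wise modulation described in this entry can be sketched as follows. This is a minimal illustration, assuming the gate is a sigmoid of a linear function of the current input and the previous hidden state; the names `ele_att_gate`, `W_x`, `W_h`, and `b` are hypothetical and not taken from the paper's code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def ele_att_gate(x, h_prev, W_x, W_h, b):
    """Return (gated_input, attention) for one time step.

    attention holds one value in (0, 1) per input dimension, and the
    gated input is the element-wise product attention * x.
    The sigmoid-of-linear form is an assumption made for this sketch.
    """
    from_input = matvec(W_x, x)
    from_hidden = matvec(W_h, h_prev)
    attention = [sigmoid(a + c + d) for a, c, d in zip(from_input, from_hidden, b)]
    gated = [a * xi for a, xi in zip(attention, x)]
    return gated, attention


# With all-zero parameters every element receives attention 0.5.
x = [1.0, -2.0, 0.5]
h_prev = [0.0, 0.0]
W_x = [[0.0] * 3 for _ in range(3)]
W_h = [[0.0] * 2 for _ in range(3)]
b = [0.0, 0.0, 0.0]
gated, attention = ele_att_gate(x, h_prev, W_x, W_h, b)
```

The gated input would then be fed to the RNN block in place of the raw input, so dimensions with low attention contribute less to the state update.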
1 code implementation • ICCV 2019 • Haibo Qiu, Chunyu Wang, Jingdong Wang, Naiyan Wang, Wen-Jun Zeng
It consists of two separate steps: (1) estimating the 2D poses in multi-view images and (2) recovering the 3D poses from the multi-view 2D poses.
Ranked #6 on 3D Human Pose Estimation on Total Capture
1 code implementation • 23 Jun 2019 • Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng
Accordingly, a hybrid network representation is presented which enables us to leverage the Variational Dropout so that the approximation of the posterior distribution becomes fully gradient-based and highly efficient.
1 code implementation • 30 May 2019 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Guoqiang Wei, Zhibo Chen
Specifically, we build a Semantics Aligning Network (SAN) which consists of a base network as encoder (SA-Enc) for re-ID, and a decoder (SA-Dec) for reconstructing/regressing the densely semantics aligned full texture image.
no code implementations • 24 May 2019 • Zheng Wang, Zhixiang Wang, Yinqiang Zheng, Yang Wu, Wen-Jun Zeng, Shin'ichi Satoh
An efficient and effective person re-identification (ReID) system relieves the users from painful and boring video watching and accelerates the process of video analysis.
no code implementations • 17 Apr 2019 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhizheng Zhang, Zhibo Chen
We achieve this by the context interaction among the features of different scales.
no code implementations • CVPR 2019 • Guangting Wang, Chong Luo, Zhiwei Xiong, Wen-Jun Zeng
The two stages are connected in series as the input proposals of the FM stage are generated by the CM stage.
1 code implementation • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Xin Jin, Zhibo Chen
For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key to re-id, i.e., discriminative feature learning.
no code implementations • 3 Apr 2019 • Wentong Liao, Cuiling Lan, Wen-Jun Zeng, Michael Ying Yang, Bodo Rosenhahn
We further explore more powerful representations by integrating language prior with the visual context in the transformation for the scene graph generation.
2 code implementations • CVPR 2020 • Pengfei Zhang, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng
Skeleton-based human action recognition has attracted great interest thanks to the easy accessibility of the human skeleton data.
Ranked #1 on Skeleton Based Action Recognition on SYSU 3D
1 code implementation • 11 Mar 2019 • Ren Yang, Xiaoyan Sun, Mai Xu, Wen-Jun Zeng
The past decade has witnessed great success in applying deep learning to enhance the quality of compressed video.
no code implementations • 30 Jan 2019 • Guoqiang Wei, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
The diversity of capturing viewpoints and the flexibility of human poses, however, remain significant challenges.
no code implementations • CVPR 2019 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
We propose a densely semantically aligned person re-identification framework.
no code implementations • 13 Nov 2018 • Hao Luo, Wenxuan Xie, Xinggang Wang, Wen-Jun Zeng
Trackers are in general more efficient than detectors but bear the risk of drifting.
no code implementations • 11 Sep 2018 • Xiaolin Song, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jingyu Yang, Xiaoyan Sun
We propose a video level 2D feature representation by transforming the convolutional features of all frames to a 2D feature map, referred to as VideoMap.
Ranked #53 on Action Recognition on UCF101
no code implementations • 5 Sep 2018 • Anfeng He, Chong Luo, Xinmei Tian, Wen-Jun Zeng
Recently, Siamese network based trackers have received tremendous interest for their fast tracking speed and high performance.
Ranked #10 on Visual Object Tracking on VOT2017/18
no code implementations • ECCV 2018 • Jieru Mei, Chunyu Wang, Wen-Jun Zeng
The archetypes generally correspond to the extremal points in the dataset and are learned by requiring them to be convex combinations of the training data.
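The convex-combination constraint in this entry can be sketched briefly. The example assumes the combination weights are produced by a softmax over free scores, which guarantees they are non-negative and sum to one; that parametrization and the names `softmax` and `archetype_from_data` are illustrative assumptions, not the paper's stated method.

```python
import math


def softmax(scores):
    """Map arbitrary scores to non-negative weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def archetype_from_data(data, scores):
    """Build one archetype as a convex combination of the data points.

    data is a list of equal-length points; scores has one entry per point.
    The softmax parametrization is an assumption made for this sketch.
    """
    weights = softmax(scores)
    dim = len(data[0])
    point = [sum(w * row[j] for w, row in zip(weights, data)) for j in range(dim)]
    return point, weights


# Equal scores give the centroid; a dominant score pulls the archetype
# toward the corresponding (possibly extremal) data point.
data = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
centroid, uniform = archetype_from_data(data, [0.0, 0.0, 0.0])
corner, _ = archetype_from_data(data, [0.0, 50.0, 0.0])
```

Because the weights are convex, every archetype stays inside the convex hull of the training data, which is what ties the learned archetypes to actual samples.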
no code implementations • ECCV 2018 • Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng
We propose adding a simple yet effective Element-wise Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer) that empowers the RNN neurons to have the attentiveness capability.
Ranked #112 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 19 Jun 2018 • Bi Li, Wenxuan Xie, Wen-Jun Zeng, Wenyu Liu
Generally, model update is formulated as an online learning problem where a target model is learned over the online training set.
Ranked #1 on Visual Tracking on OTB-100
no code implementations • CVPR 2018 • Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wen-Jun Zeng
Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition.
2 code implementations • 20 Apr 2018 • Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng
In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning-based, data-driven manner.
Ranked #1 on Skeleton Based Action Recognition on UWA3D
1 code implementation • CVPR 2018 • Anfeng He, Chong Luo, Xinmei Tian, Wen-Jun Zeng
SA-Siam is composed of a semantic branch and an appearance branch.
Ranked #1 on Visual Object Tracking on OTB-50
no code implementations • 30 Jan 2018 • Peng Tang, Chunyu Wang, Xinggang Wang, Wenyu Liu, Wen-Jun Zeng, Jingdong Wang
In particular, our method improves results by 8.8% over the static image detector for fast moving objects.
1 code implementation • 12 Dec 2017 • Boyi Li, Wenqi Ren, Dengpan Fu, DaCheng Tao, Dan Feng, Wen-Jun Zeng, Zhangyang Wang
We present a comprehensive study and evaluation of existing single image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single Image DEhazing (RESIDE).
no code implementations • ICCV 2017 • Ke Sun, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Dong Liu, Jingdong Wang
We present a two-stage normalization scheme, human body normalization and limb normalization, to make the distribution of the relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation.
1 code implementation • ICCV 2017 • Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng
Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end.
Ranked #6 on Skeleton Based Action Recognition on SYSU 3D
4 code implementations • 23 Nov 2016 • Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng
A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.
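The structure this entry describes, a stack of residual blocks whose identity mappings bypass the residual branches, can be sketched in a few lines. The functions `residual_block` and `residual_network` are hypothetical names, and the branches here are plain Python callables standing in for learned layers.

```python
def residual_block(x, branch):
    """Return x + branch(x): the identity path skips the residual branch."""
    fx = branch(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def residual_network(x, branches):
    """Apply a sequence of residual blocks, one per branch."""
    for branch in branches:
        x = residual_block(x, branch)
    return x


def zero_branch(v):
    # A branch that contributes nothing, so the block reduces to identity.
    return [0.0 for _ in v]


def small_branch(v):
    # A toy branch that adds 10% of its input.
    return [0.1 * vi for vi in v]


# With zero branches the input passes through unchanged, however deep the stack.
unchanged = residual_network([1.0, -3.0], [zero_branch] * 5)
# With the toy branch, each block scales the signal by 1.1.
scaled = residual_network([1.0], [small_branch, small_branch])
```

The first call illustrates the point about information flow: even when the residual branches contribute nothing, the identity path carries the signal through every block unchanged.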
no code implementations • 18 Nov 2016 • Sijie Song, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jiaying Liu
In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data.
Ranked #122 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 13 Jun 2016 • Jiaying Liu, Wenhan Yang, Xiaoyan Sun, Wen-Jun Zeng
With the rapid development of social networks and multimedia technology, customized image and video stylization has been widely used for various social-media applications.
2 code implementations • 25 May 2016 • Jingdong Wang, Zhen Wei, Ting Zhang, Wen-Jun Zeng
Second, in our suggested fused net, formed by one deep and one shallow base network, the information flows from the earlier intermediate layer of the deep base network to the output, and from the input to the later intermediate layer of the deep base network, are both improved.
1 code implementation • 19 Apr 2016 • Yanghao Li, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Chunfeng Yuan, Jiaying Liu
In this paper, we study the problem of online action detection from streaming skeleton data.
no code implementations • 24 Mar 2016 • Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie
Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.
no code implementations • CVPR 2015 • Lizhi Wang, Zhiwei Xiong, Dahua Gao, Guangming Shi, Wen-Jun Zeng, Feng Wu
We propose a novel dual-camera design to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution.