Search Results for author: Wen-Jun Zeng

Found 49 papers, 18 papers with code

Feature Alignment and Restoration for Domain Generalization and Adaptation

no code implementations 22 Jun 2020 Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

To ensure high discrimination, we propose a Feature Restoration (FR) operation to distill task-relevant features from the residual information and use them to compensate for the aligned features.

Disentanglement Domain Generalization +1

Beyond Triplet Loss: Meta Prototypical N-tuple Loss for Person Re-identification

no code implementations 8 Jun 2020 Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Shih-Fu Chang

There is a lack of loss design which enables the joint optimization of multiple instances (of multiple classes) within per-query optimization for person ReID.

Classification General Classification +3

Global Distance-distributions Separation for Unsupervised Person Re-identification

no code implementations ECCV 2020 Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

To address this problem, we introduce a global distance-distributions separation (GDS) constraint over the two distributions to encourage the clear separation of positive and negative samples from a global view.

Domain Adaptation POS +1

VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

2 code implementations ECCV 2020 Hanyue Tu, Chunyu Wang, Wen-Jun Zeng

In contrast to previous efforts, which require establishing cross-view correspondence based on noisy and incomplete 2D pose estimations, we present an end-to-end solution which directly operates in the 3D space and therefore avoids making incorrect decisions in the 2D space.

Ranked #5 on 3D Multi-Person Pose Estimation on Panoptic (using extra training data)

3D Multi-Person Pose Estimation

Spatiotemporal Fusion in 3D CNNs: A Probabilistic View

no code implementations CVPR 2020 Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng

Based on the probability space, we further generate new fusion strategies which achieve the state-of-the-art performance on four well-known action recognition datasets.

Action Recognition In Videos Temporal Action Localization

FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking

30 code implementations 4 Apr 2020 Yifu Zhang, Chunyu Wang, Xinggang Wang, Wen-Jun Zeng, Wenyu Liu

Formulating MOT as multi-task learning of object detection and re-ID in a single network is appealing since it allows joint optimization of the two tasks and enjoys high computation efficiency.

Ranked #1 on Multi-Object Tracking on 2DMOT15 (using extra training data)

Fairness Multi-Object Tracking +3

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification

no code implementations CVPR 2020 Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into a discriminative video-level feature representation.

Video-Based Person Re-Identification

Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach

1 code implementation CVPR 2020 Zhe Zhang, Chunyu Wang, Wenhu Qin, Wen-Jun Zeng

Then we lift the multi-view 2D poses to the 3D space by an Orientation Regularized Pictorial Structure Model (ORPSM) which jointly minimizes the projection error between the 3D and 2D poses, along with the discrepancy between the 3D pose and IMU orientations.

2D Pose Estimation 3D Absolute Human Pose Estimation

Uncertainty-Aware Multi-Shot Knowledge Distillation for Image-Based Object Re-Identification

no code implementations 15 Jan 2020 Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

To the best of our knowledge, we are the first to make use of multi-shots of an object in a teacher-student learning manner for effectively boosting the single image based re-id.

Knowledge Distillation

PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network

4 code implementations Applications of Artificial Intelligence Conference 2019 Dacheng Yin, Chong Luo, Zhiwei Xiong, Wen-Jun Zeng

We discover that the two streams should communicate with each other, and this is crucial to phase prediction.

Sound Audio and Speech Processing

Unsupervised High-Resolution Depth Learning From Videos With Dual Networks

no code implementations ICCV 2019 Junsheng Zhou, Yuwang Wang, Kaihuai Qin, Wen-Jun Zeng

Unsupervised depth learning takes the appearance difference between a target view and a view synthesized from its adjacent frame as supervisory signal.

Monocular Depth Estimation Vocal Bursts Intensity Prediction

Multi-grained Attention Networks for Single Image Super-Resolution

no code implementations 26 Sep 2019 Huapeng Wu, Zhengxia Zou, Jie Gui, Wen-Jun Zeng, Jieping Ye, Jun Zhang, Hongyi Liu, Zhihui Wei

In this paper, we make a thorough investigation of the attention mechanisms in an SR model and shed light on how simple and effective improvements on these ideas improve the state of the art.

Feature Importance Image Super-Resolution

Cross View Fusion for 3D Human Pose Estimation

1 code implementation ICCV 2019 Haibo Qiu, Chunyu Wang, Jingdong Wang, Naiyan Wang, Wen-Jun Zeng

It consists of two separate steps: (1) estimating the 2D poses in multi-view images and (2) recovering the 3D poses from the multi-view 2D poses.

2D Pose Estimation 3D Human Pose Estimation +1

EleAtt-RNN: Adding Attentiveness to Neurons in Recurrent Neural Networks

no code implementations 3 Sep 2019 Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

For an RNN block, an EleAttG is used for adaptively modulating the input by assigning different levels of importance, i.e., attention, to each element/dimension of the input.

Action Recognition Gesture Recognition +1
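
The element-wise gating described in the EleAtt-RNN entry above can be illustrated with a minimal sketch. The listing only states that the gate assigns an importance level to each input element; the sigmoid form, the dependence on the previous hidden state, and every name here (`ele_att_gate`, `W_x`, `W_h`, `b`) are assumptions for illustration, not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ele_att_gate(x, h_prev, W_x, W_h, b):
    """Hypothetical element-wise attention gate.

    Computes one attention value per input dimension (assumed form:
    sigmoid(W_x @ x + W_h @ h_prev + b)) and scales each input element
    by its own attention value.
    """
    attention = []
    for i in range(len(x)):
        z = b[i]
        z += sum(W_x[i][j] * x[j] for j in range(len(x)))
        z += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        attention.append(sigmoid(z))
    # Element-wise modulation: each input dimension gets its own weight.
    modulated = [a * xi for a, xi in zip(attention, x)]
    return attention, modulated


# With all-zero weights and biases every attention value is sigmoid(0) = 0.5.
attention, modulated = ele_att_gate(
    x=[2.0, -4.0],
    h_prev=[0.0],
    W_x=[[0.0, 0.0], [0.0, 0.0]],
    W_h=[[0.0], [0.0]],
    b=[0.0, 0.0],
)
```

In this sketch the modulated input would then replace the raw input of the RNN block; trained weights would make the attention values differ across dimensions.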

Posterior-Guided Neural Architecture Search

1 code implementation 23 Jun 2019 Yizhou Zhou, Xiaoyan Sun, Chong Luo, Zheng-Jun Zha, Wen-Jun Zeng

Accordingly, a hybrid network representation is presented which enables us to leverage the Variational Dropout so that the approximation of the posterior distribution becomes fully gradient-based and highly efficient.

Image Classification Neural Architecture Search +1

Semantics-Aligned Representation Learning for Person Re-identification

1 code implementation 30 May 2019 Xin Jin, Cuiling Lan, Wen-Jun Zeng, Guoqiang Wei, Zhibo Chen

Specifically, we build a Semantics Aligning Network (SAN) which consists of a base network as encoder (SA-Enc) for re-ID, and a decoder (SA-Dec) for reconstructing/regressing the densely semantics aligned full texture image.

Person Re-Identification Representation Learning +1

Beyond Intra-modality: A Survey of Heterogeneous Person Re-identification

no code implementations 24 May 2019 Zheng Wang, Zhixiang Wang, Yinqiang Zheng, Yang Wu, Wen-Jun Zeng, Shin'ichi Satoh

An efficient and effective person re-identification (ReID) system relieves the users from painful and boring video watching and accelerates the process of video analysis.

Person Re-Identification

Relation-Aware Global Attention for Person Re-identification

1 code implementation CVPR 2020 Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Xin Jin, Zhibo Chen

For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key of re-id, i.e., discriminative feature learning.

Clustering Image Classification +2

Target-Tailored Source-Transformation for Scene Graph Generation

no code implementations 3 Apr 2019 Wentong Liao, Cuiling Lan, Wen-Jun Zeng, Michael Ying Yang, Bodo Rosenhahn

We further explore more powerful representations by integrating language prior with the visual context in the transformation for the scene graph generation.

graph construction Graph Generation +4

Quality-Gated Convolutional LSTM for Enhancing Compressed Video

1 code implementation 11 Mar 2019 Ren Yang, Xiaoyan Sun, Mai Xu, Wen-Jun Zeng

The past decade has witnessed great success in applying deep learning to enhance the quality of compressed video.

View Invariant 3D Human Pose Estimation

no code implementations 30 Jan 2019 Guoqiang Wei, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen

The diversity of capturing viewpoints and the flexibility of human poses, however, remain significant challenges.

3D Human Pose Estimation 3D Pose Estimation

Temporal-Spatial Mapping for Action Recognition

no code implementations 11 Sep 2018 Xiaolin Song, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jingyu Yang, Xiaoyan Sun

We propose a video level 2D feature representation by transforming the convolutional features of all frames to a 2D feature map, referred to as VideoMap.

Action Recognition Image Classification +3

Towards a Better Match in Siamese Network Based Visual Object Tracker

no code implementations 5 Sep 2018 Anfeng He, Chong Luo, Xinmei Tian, Wen-Jun Zeng

Recently, Siamese network based trackers have received tremendous interest for their fast tracking speed and high performance.

Visual Object Tracking

Online Dictionary Learning for Approximate Archetypal Analysis

no code implementations ECCV 2018 Jieru Mei, Chunyu Wang, Wen-Jun Zeng

The archetypes generally correspond to the extremal points in the dataset and are learned by requiring them to be convex combinations of the training data.

Dictionary Learning

Adding Attentiveness to the Neurons in Recurrent Neural Networks

no code implementations ECCV 2018 Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

We propose adding a simple yet effective Element-wise-Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer) that empowers the RNN neurons to have the attentiveness capability.

Action Recognition Skeleton Based Action Recognition +1

Learning to Update for Object Tracking with Recurrent Meta-learner

no code implementations 19 Jun 2018 Bi Li, Wenxuan Xie, Wen-Jun Zeng, Wenyu Liu

Generally, model update is formulated as an online learning problem where a target model is learned over the online training set.

Meta-Learning Visual Object Tracking +1

MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition

no code implementations CVPR 2018 Yizhou Zhou, Xiaoyan Sun, Zheng-Jun Zha, Wen-Jun Zeng

Recent attempts use 3D convolutional neural networks (CNNs) to explore spatio-temporal information for human action recognition.

Action Recognition Temporal Action Localization

View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition

2 code implementations 20 Apr 2018 Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng

In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning based data driven manner.

Action Recognition Skeleton Based Action Recognition +1

Benchmarking Single Image Dehazing and Beyond

1 code implementation 12 Dec 2017 Boyi Li, Wenqi Ren, Dengpan Fu, DaCheng Tao, Dan Feng, Wen-Jun Zeng, Zhangyang Wang

We present a comprehensive study and evaluation of existing single image dehazing algorithms, using a new large-scale benchmark consisting of both synthetic and real-world hazy images, called REalistic Single Image DEhazing (RESIDE).

Benchmarking Image Dehazing +1

Human Pose Estimation using Global and Local Normalization

no code implementations ICCV 2017 Ke Sun, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Dong Liu, Jingdong Wang

We present a two-stage normalization scheme, human body normalization and limb normalization, to make the distribution of the relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation.

Pose Estimation

View Adaptive Recurrent Neural Networks for High Performance Human Action Recognition from Skeleton Data

1 code implementation ICCV 2017 Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng

Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end.

Action Recognition Skeleton Based Action Recognition +1

Deep Convolutional Neural Networks with Merge-and-Run Mappings

4 code implementations 23 Nov 2016 Liming Zhao, Jingdong Wang, Xi Li, Zhuowen Tu, Wen-Jun Zeng

A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow.
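
The identity-mapping idea in the summary above can be sketched in a few lines. This is a toy illustration on plain Python lists, assuming each residual branch is an arbitrary function of the block input; the names `residual_block` and `stack_blocks` are hypothetical, and the sketch shows only a plain residual stack, not the merge-and-run mapping the paper proposes.

```python
def residual_block(x, branch):
    """Return x + branch(x): the identity path skips the residual branch."""
    fx = branch(x)
    return [xi + fi for xi, fi in zip(x, fx)]


def stack_blocks(x, branches):
    """Apply a sequence of residual blocks, one per branch function."""
    for branch in branches:
        x = residual_block(x, branch)
    return x


# A branch that outputs zeros leaves the signal untouched, which is why
# information can flow through a deep stack via the identity path.
zero_branch = lambda v: [0.0 for _ in v]
halve_branch = lambda v: [0.5 * vi for vi in v]

unchanged = stack_blocks([1.0, 2.0], [zero_branch] * 10)
scaled = stack_blocks([1.0, 2.0], [halve_branch])
```

Here `unchanged` equals the input because every branch contributes nothing, while `scaled` adds half of the input back onto itself.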

Photo Stylistic Brush: Robust Style Transfer via Superpixel-Based Bipartite Graph

no code implementations 13 Jun 2016 Jiaying Liu, Wenhan Yang, Xiaoyan Sun, Wen-Jun Zeng

With the rapid development of social network and multimedia technology, customized image and video stylization has been widely used for various social-media applications.

Style Transfer Superpixels

Deeply-Fused Nets

2 code implementations 25 May 2016 Jingdong Wang, Zhen Wei, Ting Zhang, Wen-Jun Zeng

Second, in our suggested fused net formed by one deep and one shallow base network, the flows of information from the earlier intermediate layer of the deep base network to the output and from the input to the later intermediate layer of the deep base network are both improved.

Co-occurrence Feature Learning for Skeleton based Action Recognition using Regularized Deep LSTM Networks

no code implementations 24 Mar 2016 Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie

Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.

Action Recognition Skeleton Based Action Recognition +1

High-Speed Hyperspectral Video Acquisition With a Dual-Camera Architecture

no code implementations CVPR 2015 Lizhi Wang, Zhiwei Xiong, Dahua Gao, Guangming Shi, Wen-Jun Zeng, Feng Wu

We propose a novel dual-camera design to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution.

Vocal Bursts Intensity Prediction
