Search Results for author: Xiangyu Zhu

Found 65 papers, 25 papers with code

Person Re-identification by Local Maximal Occurrence Representation and Metric Learning

1 code implementation CVPR 2015 Shengcai Liao, Yang Hu, Xiangyu Zhu, Stan Z. Li

In this paper, we propose an effective feature representation called Local Maximal Occurrence (LOMO), and a subspace and metric learning method called Cross-view Quadratic Discriminant Analysis (XQDA).

Metric Learning Person Re-Identification

High-Fidelity Pose and Expression Normalization for Face Recognition in the Wild

no code implementations CVPR 2015 Xiangyu Zhu, Zhen Lei, Junjie Yan, Dong Yi, Stan Z. Li

Pose and expression normalization is a crucial step to recover the canonical view of faces under arbitrary conditions, so as to improve the face recognition performance.

Face Recognition Vocal Bursts Intensity Prediction

Face Alignment Across Large Poses: A 3D Solution

no code implementations CVPR 2016 Xiangyu Zhu, Zhen Lei, Xiaoming Liu, Hailin Shi, Stan Z. Li

Face alignment, which fits a face model to an image and extracts the semantic meanings of facial pixels, has been an important topic in CV community.

3D Face Reconstruction Face Alignment +2

Constrained Deep Metric Learning for Person Re-identification

no code implementations 24 Nov 2015 Hailin Shi, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Yang Yang, Stan Z. Li

In this paper, we propose a novel CNN-based method to learn a discriminative metric with good robustness to the over-fitting problem in person re-identification.

Metric Learning Person Re-Identification

Embedding Deep Metric for Person Re-identification: A Study Against Large Variations

no code implementations 1 Nov 2016 Hailin Shi, Yang Yang, Xiangyu Zhu, Shengcai Liao, Zhen Lei, Wei-Shi Zheng, Stan Z. Li

From this point of view, selecting suitable positive (i.e., intra-class) training samples within a local range is critical for training the CNN embedding, especially when the data has large intra-class variations.

Person Re-Identification

R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

1 code implementation 29 Jun 2017 Yingying Jiang, Xiangyu Zhu, Xiaobing Wang, Shuli Yang, Wei Li, Hua Wang, Pei Fu, Zhenbo Luo

In this paper, we propose a novel method called Rotational Region CNN (R2CNN) for detecting arbitrary-oriented texts in natural scene images.

Region Proposal Scene Text Detection +1

FaceBoxes: A CPU Real-time Face Detector with High Accuracy

10 code implementations 17 Aug 2017 Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, Xiaobo Wang, Stan Z. Li

The MSCL aims at enriching the receptive fields and discretizing anchors over different layers to handle faces of various scales.

Face Detection Vocal Bursts Intensity Prediction

S$^3$FD: Single Shot Scale-invariant Face Detector

3 code implementations 17 Aug 2017 Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, Xiaobo Wang, Stan Z. Li

This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S$^3$FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces.

Face Detection

S3FD: Single Shot Scale-Invariant Face Detector

no code implementations ICCV 2017 Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, Xiaobo Wang, Stan Z. Li

This paper presents a real-time face detector, named Single Shot Scale-invariant Face Detector (S3FD), which performs superiorly on various scales of faces with a single deep neural network, especially for small faces.

Face Detection

Face Alignment in Full Pose Range: A 3D Total Solution

2 code implementations 2 Apr 2018 Xiangyu Zhu, Xiaoming Liu, Zhen Lei, Stan Z. Li

In this paper, we propose to tackle these three challenges in a new alignment framework termed 3D Dense Face Alignment (3DDFA), in which a dense 3D Morphable Model (3DMM) is fitted to the image via Cascaded Convolutional Neural Networks.

3D Pose Estimation Depth Image Estimation +3

Face Synthesis for Eyeglass-Robust Face Recognition

1 code implementation 4 Jun 2018 Jianzhu Guo, Xiangyu Zhu, Zhen Lei, Stan Z. Li

A feasible method is to collect large-scale face images with eyeglasses for training deep learning methods.

Face Generation Face Model +2

Large-scale Bisample Learning on ID Versus Spot Face Recognition

no code implementations 8 Jun 2018 Xiangyu Zhu, Hao Liu, Zhen Lei, Hailin Shi, Fan Yang, Dong Yi, Guo-Jun Qi, Stan Z. Li

In this paper, we propose a deep learning based large-scale bisample learning (LBL) method for IvS face recognition.

Face Recognition General Classification

SECaps: A Sequence Enhanced Capsule Model for Charge Prediction

no code implementations 10 Oct 2018 Congqing He, Li Peng, Yuquan Le, JiaWei He, Xiangyu Zhu

In this paper, we propose a Sequence Enhanced Capsule model, dubbed as SECaps model, to relieve this problem.

Prior-Knowledge and Attention-based Meta-Learning for Few-Shot Learning

no code implementations 11 Dec 2018 Yunxiao Qin, WeiGuo Zhang, Chenxu Zhao, Zezheng Wang, Xiangyu Zhu, Guo-Jun Qi, Jingping Shi, Zhen Lei

In this paper, inspired by the human cognition process which utilizes both prior-knowledge and vision attention in learning new knowledge, we present a novel paradigm of meta-learning approach with three developments to introduce attention mechanism and prior-knowledge for meta-learning.

Few-Shot Learning

Improving Face Anti-Spoofing by 3D Virtual Synthesis

2 code implementations 2 Jan 2019 Jianzhu Guo, Xiangyu Zhu, Jinchuan Xiao, Zhen Lei, Genxun Wan, Stan Z. Li

Specifically, we consider a printed photo as a flat surface and mesh it into a 3D object, which is then randomly bent and rotated in 3D space.

Face Anti-Spoofing Face Recognition

Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection

no code implementations ICCV 2019 Lu Zhang, Xiangyu Zhu, Xiangyu Chen, Xu Yang, Zhen Lei, Zhi-Yong Liu

In this paper, we propose a novel Aligned Region CNN (AR-CNN) to handle the weakly aligned multispectral data in an end-to-end way.

Position

Two-phase Hair Image Synthesis by Self-Enhancing Generative Model

no code implementations 28 Feb 2019 Haonan Qiu, Chuan Wang, Hang Zhu, Xiangyu Zhu, Jinjin Gu, Xiaoguang Han

Generating plausible hair image given limited guidance, such as sparse sketches or low-resolution image, has been made possible with the rise of Generative Adversarial Networks (GANs).

Image-to-Image Translation Super-Resolution +2

Grand Challenge of 106-Point Facial Landmark Localization

no code implementations 9 May 2019 Yinglu Liu, Hao Shen, Yue Si, Xiaobo Wang, Xiangyu Zhu, Hailin Shi, Zhibin Hong, Hanqi Guo, Ziyuan Guo, Yanqin Chen, Bi Li, Teng Xi, Jun Yu, Haonian Xie, Guochen Xie, Mengyan Li, Qing Lu, Zengfu Wang, Shenqi Lai, Zhenhua Chai, Xiaoming Wei

However, previous competitions on facial landmark localization (i.e., the 300-W, 300-VW and Menpo challenges) aim to predict 68-point landmarks, which are insufficient to depict the structure of facial components.

Face Alignment Face Recognition +2

Learning Meta Face Recognition in Unseen Domains

5 code implementations CVPR 2020 Jianzhu Guo, Xiangyu Zhu, Chenxu Zhao, Dong Cao, Zhen Lei, Stan Z. Li

Face recognition systems are usually faced with unseen domains in real-world applications and show unsatisfactory performance due to their poor generalization.

Face Recognition Meta-Learning

Domain Balancing: Face Recognition on Long-Tailed Domains

no code implementations CVPR 2020 Dong Cao, Xiangyu Zhu, Xingyu Huang, Jianzhu Guo, Zhen Lei

Finally, we propose a Domain Balancing Margin (DBM) in the loss function to further optimize the feature space of the tail domains to improve generalization.

Face Recognition

Pixel-Face: A Large-Scale, High-Resolution Benchmark for 3D Face Reconstruction

no code implementations 28 Aug 2020 Jiangjing Lyu, Xiaobo Li, Xiangyu Zhu, Cheng Cheng

It is also a challenging task due to the lack of high-quality datasets that can fuel current deep learning-based methods.

3D Face Reconstruction

Towards Fast, Accurate and Stable 3D Dense Face Alignment

3 code implementations ECCV 2020 Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei, Stan Z. Li

Firstly, on the basis of a lightweight backbone, we propose a meta-joint optimization strategy to dynamically regress a small set of 3DMM parameters, which greatly enhances speed and accuracy simultaneously.

Ranked #1 on 3D Face Reconstruction on Florence (Mean NME metric)

3D Face Modelling 3D Face Reconstruction +2

Face Forgery Detection by 3D Decomposition

no code implementations CVPR 2021 Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li

Detecting digital face manipulation has attracted extensive attention due to fake media's potential harms to the public.

DeepThermal: Combustion Optimization for Thermal Power Generating Units Using Offline Reinforcement Learning

no code implementations 23 Feb 2021 Xianyuan Zhan, Haoran Xu, Yue Zhang, Xiangyu Zhu, Honglei Yin, Yu Zheng

Optimizing the combustion efficiency of a thermal power generating unit (TPGU) is a highly challenging and critical task in the energy industry.

Continuous Control Offline RL +2

Model-Based Offline Planning with Trajectory Pruning

1 code implementation 16 May 2021 Xianyuan Zhan, Xiangyu Zhu, Haoran Xu

The recent offline reinforcement learning (RL) studies have achieved much progress to make RL usable in real-world systems by learning policies from pre-collected datasets without environment interaction.

Offline RL Reinforcement Learning (RL)

Constraints Penalized Q-learning for Safe Offline Reinforcement Learning

no code implementations 19 Jul 2021 Haoran Xu, Xianyuan Zhan, Xiangyu Zhu

We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment.

Offline RL Q-Learning +2

Object Dynamics Distillation for Scene Decomposition and Representation

no code implementations ICLR 2022 Qu Tang, Xiangyu Zhu, Zhen Lei, Zhaoxiang Zhang

In this paper, we work on object dynamics and propose Object Dynamics Distillation Network (ODDN), a framework that distills explicit object dynamics (e.g., velocity) from sequential static representations.

Object Predict Future Video Frames +1

A Novel Sequence Tagging Framework for Consumer Event-Cause Extraction

no code implementations 28 Oct 2021 Congqing He, Jie Zhang, Xiangyu Zhu, Huan Liu, Yukun Huang

To this end, we introduce a fresh perspective to revisit the relational event-cause extraction task and propose a novel sequence tagging framework, instead of extracting event types and events-causes separately.

Makeup216: Logo Recognition with Adversarial Attention Representations

no code implementations 13 Dec 2021 Junjun Hu, Yanhao Zhu, Bo Zhao, Jiexin Zheng, Chenxu Zhao, Xiangyu Zhu, Kangle Wu, Darun Tang

One of the challenges of logo recognition lies in the diversity of forms, such as symbols, texts or a combination of both; further, logos tend to be extremely concise in design while similar in appearance, suggesting the difficulty of learning discriminative representations.

Logo Recognition

Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation

no code implementations 24 Dec 2021 Zhiwei Liu, Xiangyu Zhu, Lu Yang, Xiang Yan, Ming Tang, Zhen Lei, Guibo Zhu, Xuetao Feng, Yan Wang, Jinqiao Wang

In the second stage, we design a mesh refinement transformer (MRT) to respectively refine each coarse reconstruction result via a self-attention mechanism.

Ranked #65 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D human pose and shape estimation 3D Reconstruction

HP-Capsule: Unsupervised Face Part Discovery by Hierarchical Parsing Capsule Network

no code implementations CVPR 2022 Chang Yu, Xiangyu Zhu, Xiaomei Zhang, Zidu Wang, Zhaoxiang Zhang, Zhen Lei

Capsule networks are designed to present the objects by a set of parts and their relationships, which provide an insight into the procedure of visual perception.

Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape

no code implementations 9 Apr 2022 Xiangyu Zhu, Chang Yu, Di Huang, Zhen Lei, Hao Wang, Stan Z. Li

3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior.

Vocal Bursts Intensity Prediction

Weakly Aligned Feature Fusion for Multimodal Object Detection

no code implementations 21 Apr 2022 Lu Zhang, Zhiyong Liu, Xiangyu Zhu, Zhan Song, Xu Yang, Zhen Lei, Hong Qiao

In this article, we propose a general multimodal detector named aligned region CNN (AR-CNN) to tackle the position shift problem.

Object object-detection +2

MVP-Human Dataset for 3D Human Avatar Reconstruction from Unconstrained Frames

1 code implementation 24 Apr 2022 Xiangyu Zhu, Tingting Liao, Jiangjing Lyu, Xiang Yan, Yunfeng Wang, Kan Guo, Qiong Cao, Stan Z. Li, Zhen Lei

In this paper, we consider a novel problem of reconstructing a 3D human avatar from multiple unconstrained frames, independent of assumptions on camera calibration, capture space, and constrained actions.

Camera Calibration

Towards 3D Face Reconstruction in Perspective Projection: Estimating 6DoF Face Pose from Monocular Image

1 code implementation 9 May 2022 Yueying Kao, Bowen Pan, Miao Xu, Jiangjing Lyu, Xiangyu Zhu, Yuanzhang Chang, Xiaobo Li, Zhen Lei

In 3D face reconstruction, orthogonal projection has been widely employed to substitute perspective projection to simplify the fitting process.

3D Face Reconstruction

When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning

2 code implementations 23 May 2022 Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya-Qin Zhang

In offline reinforcement learning (RL), one detrimental issue to policy learning is the error accumulation of deep Q function in out-of-distribution (OOD) areas.

D4RL Offline RL +2

Deep Learning for Human Parsing: A Survey

no code implementations 29 Jan 2023 Xiaomei Zhang, Xiangyu Zhu, Ming Tang, Zhen Lei

Human parsing is a key topic in image processing with many applications, such as surveillance analysis, human-robot interaction, person search, and clothing category classification, among many others.

Human Parsing Person Search

Intrinsic Physical Concepts Discovery with Object-Centric Predictive Models

no code implementations CVPR 2023 Qu Tang, Xiangyu Zhu, Zhen Lei, Zhaoxiang Zhang

The ability to discover abstract physical concepts and understand how they work in the world through observing lies at the core of human intelligence.

EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation

2 code implementations ICCV 2023 Ziqiao Peng, HaoYu Wu, Zhenbo Song, Hao Xu, Xiangyu Zhu, Jun He, Hongyan Liu, Zhaoxin Fan

Specifically, we introduce the emotion disentangling encoder (EDE) to disentangle the emotion and content in the speech by cross-reconstructed speech signals with different emotion labels.

3D Face Animation Disentanglement

OTAvatar: One-shot Talking Face Avatar with Controllable Tri-plane Rendering

1 code implementation CVPR 2023 Zhiyuan Ma, Xiangyu Zhu, GuoJun Qi, Zhen Lei, Lei Zhang

In this paper, we propose One-shot Talking face Avatar (OTAvatar), which constructs face avatars by a generalized controllable tri-plane rendering solution so that each personalized avatar can be constructed from only one portrait as the reference.

NerVE: Neural Volumetric Edges for Parametric Curve Extraction from Point Cloud

no code implementations CVPR 2023 Xiangyu Zhu, Dong Du, Weikai Chen, Zhiyou Zhao, Yinyu Nie, Xiaoguang Han

We show that a simple network based on NerVE can already outperform the previous state-of-the-art methods by a great margin.

Keypoint Detection

Grouped Knowledge Distillation for Deep Face Recognition

no code implementations 10 Apr 2023 Weisong Zhao, Xiangyu Zhu, Kaiwen Guo, Xiao-Yu Zhang, Zhen Lei

Therefore, we seek to probe the target logits to extract the primary knowledge related to face identity, and discard the others, to make the distillation more achievable for the student network.

Face Recognition Knowledge Distillation

SelfTalk: A Self-Supervised Commutative Training Diagram to Comprehend 3D Talking Faces

1 code implementation 19 Jun 2023 Ziqiao Peng, Yihao Luo, Yue Shi, Hao Xu, Xiangyu Zhu, Jun He, Hongyan Liu, Zhaoxin Fan

To enhance the visual accuracy of generated lip movement while reducing the dependence on labeled data, we propose a novel framework SelfTalk, by involving self-supervision in a cross-modals network system to learn 3D talking faces.

3D Face Animation Lip Reading

3D Keypoint Estimation Using Implicit Representation Learning

no code implementations 20 Jun 2023 Xiangyu Zhu, Dong Du, Haibin Huang, Chongyang Ma, Xiaoguang Han

Inspired by the recent success of advanced implicit representation in reconstruction tasks, we explore the idea of using an implicit field to represent keypoints.

Keypoint Estimation Representation Learning

Cross Architecture Distillation for Face Recognition

no code implementations 26 Jun 2023 Weisong Zhao, Xiangyu Zhu, Zhixiang He, Xiao-Yu Zhang, Zhen Lei

Transformers have emerged as the superior choice for face recognition tasks, but their insufficient platform acceleration hinders their application on mobile devices.

Face Recognition Knowledge Distillation

H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps

no code implementations 22 Sep 2023 Haoyi Niu, Tianying Ji, Bingqi Liu, Haocheng Zhao, Xiangyu Zhu, Jianying Zheng, Pengfei Huang, Guyue Zhou, Jianming Hu, Xianyuan Zhan

Solving real-world complex tasks using reinforcement learning (RL) without high-fidelity simulation environments or large amounts of offline data can be quite challenging.

Offline RL Reinforcement Learning (RL)

Visual Commonsense based Heterogeneous Graph Contrastive Learning

no code implementations 11 Nov 2023 Zongzhao Li, Xiangyu Zhu, Xi Zhang, Zhaoxiang Zhang, Zhen Lei

Specifically, our model contains two key components: the Commonsense-based Contrastive Learning and the Graph Relation Network.

Contrastive Learning Question Answering +4

SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis

1 code implementation 29 Nov 2023 Ziqiao Peng, Wentao Hu, Yue Shi, Xiangyu Zhu, Xiaomei Zhang, Hao Zhao, Jun He, Hongyan Liu, Zhaoxin Fan

A lifelike talking head requires synchronized coordination of subject identity, lip movements, facial expressions, and head poses.

Talking Face Generation Talking Head Generation

3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation

no code implementations 1 Dec 2023 Zidu Wang, Xiangyu Zhu, Tianshuo Zhang, Baiqin Wang, Zhen Lei

In this paper, we fully utilize the facial part segmentation geometry by introducing Part Re-projection Distance Loss (PRDL).

3D Face Reconstruction Segmentation

SEBERTNets: Sequence Enhanced BERT Networks for Event Entity Extraction Tasks Oriented to the Finance Field

1 code implementation 21 Jan 2024 Congqing He, Xiangyu Zhu, Yuquan Le, Yuzhong Liu, Jianhong Yin

In addition, motivated by recommendation systems, we propose Hybrid Sequence Enhanced BERT Networks (HSEBERTNets for short), which use a multi-channel recall method to recall all the corresponding event entities.

Asset Management Event Extraction

DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer

1 code implementation 8 Feb 2024 Zhiyuan Ma, Xiangyu Zhu, GuoJun Qi, Chen Qian, Zhaoxiang Zhang, Zhen Lei

We suspect this is due to a shortage of paired audio-4D data, which is crucial for the Transformer to effectively perform as a denoiser within the Diffusion framework.

Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction

1 code implementation ECCV 2020 Xiangyu Zhu, Fan Yang, Di Huang, Chang Yu, Hao Wang, Jianzhu Guo, Zhen Lei, Stan Z. Li

However, most of their training data is constructed by 3D Morphable Model, whose space spanned is only a small part of the shape space.

3D Face Reconstruction
