Search Results for author: Shengfeng He

Found 40 papers, 17 papers with code

A Simple Data Mixing Prior for Improving Self-Supervised Learning

1 code implementation CVPR 2022 Sucheng Ren, Huiyu Wang, Zhengqi Gao, Shengfeng He, Alan Yuille, Yuyin Zhou, Cihang Xie

More notably, our SDMP is the first method that successfully leverages data mixing to improve (rather than hurt) the performance of Vision Transformers in the self-supervised setting.

Representation Learning Self-Supervised Learning
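The data-mixing prior builds on the generic mixup primitive: form a convex combination of two samples and weight the targets by the same coefficient. A minimal numpy sketch of that primitive follows; it is an illustration of the general idea, not SDMP's actual implementation, and the function name is illustrative.

```python
import numpy as np

def mixup(x1, x2, alpha=1.0, rng=None):
    """Generic mixup: a convex combination of two samples.
    Data-mixing methods build on this primitive, with the mixing
    coefficient `lam` also weighting the (pseudo-)targets."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)            # lam in [0, 1]
    return lam * x1 + (1.0 - lam) * x2, lam

# Example: mix an all-zeros and an all-ones 4x4 "image".
a, b = np.zeros((4, 4)), np.ones((4, 4))
mixed, lam = mixup(a, b)                    # every pixel equals 1 - lam
```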

Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

no code implementations 29 May 2022 Zheng Xiong, Liangyu Chai, Wenxi Liu, Yongtuo Liu, Sucheng Ren, Shengfeng He

To enable training under this new setting, we convert the crowd count regression problem to a ranking potential prediction problem.

Crowd Counting Learning-To-Rank
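Ranking-based crowd counting typically exploits the constraint that a crop of an image can contain no more people than the image it came from, enforced with a hinge-style ranking loss. A minimal numpy sketch of that generic loss, assuming scalar scores per region (not the authors' exact formulation):

```python
import numpy as np

def margin_ranking_loss(score_larger, score_smaller, margin=0.0):
    """Hinge-style ranking loss: penalizes the model when the region
    that must contain more people is scored below its sub-region."""
    return float(np.maximum(0.0, score_smaller - score_larger + margin))

# A full patch must rank at least as high as a crop taken inside it.
loss_ok = margin_ranking_loss(12.0, 5.0)    # ordering satisfied -> 0.0
loss_bad = margin_ranking_loss(3.0, 5.0)    # ordering violated by 2 -> 2.0
```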

High-resolution Face Swapping via Latent Semantics Disentanglement

1 code implementation CVPR 2022 Yangyang Xu, Bailin Deng, Junle Wang, Yanqing Jing, Jia Pan, Shengfeng He

Although previous research can leverage generative priors to produce high-resolution results, their quality can suffer from the entangled semantics of the latent space.

Disentanglement Face Swapping

Faithful Extreme Rescaling via Generative Prior Reciprocated Invertible Representations

1 code implementation CVPR 2022 Zhixuan Zhong, Liangyu Chai, Yang Zhou, Bailin Deng, Jia Pan, Shengfeng He

This paper presents a Generative prior ReciprocAted Invertible rescaling Network (GRAIN) for generating faithful high-resolution (HR) images from low-resolution (LR) invertible images with an extreme upscaling factor (64x).

Shunted Self-Attention via Multi-Scale Token Aggregation

1 code implementation CVPR 2022 Sucheng Ren, Daquan Zhou, Shengfeng He, Jiashi Feng, Xinchao Wang

This novel merging scheme enables the self-attention to learn relationships between objects with different sizes and simultaneously reduces the token numbers and the computational cost.

Computer Vision
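The token-merging idea can be illustrated with a minimal single-head numpy sketch: pooling key/value tokens shrinks the attention matrix while each merged token summarizes a larger region (the actual model uses learned projections and a different merging rate per head; names and shapes here are illustrative).

```python
import numpy as np

def pool_tokens(x, rate):
    """Merge each run of `rate` consecutive tokens by averaging:
    (N, C) -> (N // rate, C). Pooling keys/values this way shrinks
    the attention matrix and widens each token's receptive field."""
    n, c = x.shape
    return x[: n - n % rate].reshape(-1, rate, c).mean(axis=1)

def shunted_attention(q, k, v, rate):
    """Single-head attention with keys/values merged at one rate."""
    k, v = pool_tokens(k, rate), pool_tokens(v, rate)
    scores = q @ k.T / np.sqrt(q.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)       # row-wise softmax
    return w @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))                # 16 query tokens, dim 8
tokens = rng.normal(size=(16, 8))
out = shunted_attention(q, tokens, tokens, rate=4)
```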

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation

no code implementations 6 Aug 2021 Yongtuo Liu, Dan Xu, Sucheng Ren, Hanjie Wu, Hongmin Cai, Shengfeng He

We further leverage the derived segments to propose a crowd-aware fine-grained domain adaptation framework for crowd counting, which consists of two novel adaptation modules, i.e., Crowd Region Transfer (CRT) and Crowd Density Alignment (CDA).

Crowd Counting Domain Adaptation +1

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting

no code implementations 6 Aug 2021 Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Jing Qin, Dan Xu, Shengfeng He

In this way, we can transfer the original spatial labeling redundancy caused by individual similarities to effective supervision signals on the unlabeled regions.

Crowd Counting

Unifying Global-Local Representations in Salient Object Detection with Transformer

1 code implementation 5 Aug 2021 Sucheng Ren, Qiang Wen, Nanxuan Zhao, Guoqiang Han, Shengfeng He

In this paper, we introduce a new attention-based encoder, the vision transformer, into salient object detection to ensure that the representations remain global from shallow to deep layers.

object-detection Object Detection +1

From Continuity to Editability: Inverting GANs with Consecutive Images

2 code implementations ICCV 2021 Yangyang Xu, Yong Du, Wenpeng Xiao, Xuemiao Xu, Shengfeng He

This inborn property is used for two unique purposes: 1) regularizing the joint inversion process, so that each inverted code is semantically accessible from the others and anchored in an editable domain; 2) enforcing inter-image coherence, so that the fidelity of each inverted code can be maximized with the complement of the other images.

Co-advise: Cross Inductive Bias Distillation

no code implementations CVPR 2022 Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao

Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks.

Inductive Bias Natural Language Processing

Reciprocal Transformations for Unsupervised Video Object Segmentation

1 code implementation CVPR 2021 Sucheng Ren, Wenxi Liu, Yongtuo Liu, Haoxin Chen, Guoqiang Han, Shengfeng He

Additionally, to exclude information about moving background objects from the motion features, our transformation module reciprocally transforms the appearance features to enhance the motion features, so as to focus on moving objects with salient appearance while removing co-moving outliers.

Ranked #3 on Unsupervised Video Object Segmentation on DAVIS 2016 (using extra training data)

Optical Flow Estimation Semantic Segmentation +2

Learning From the Master: Distilling Cross-Modal Advanced Knowledge for Lip Reading

no code implementations CVPR 2021 Sucheng Ren, Yong Du, Jianming Lv, Guoqiang Han, Shengfeng He

To these ends, we introduce a trainable "master" network which ingests both audio signals and silent lip videos instead of a pretrained teacher.

Lip Reading speech-recognition +1

Spatially-Invariant Style-Codes Controlled Makeup Transfer

1 code implementation CVPR 2021 Han Deng, Chu Han, Hongmin Cai, Guoqiang Han, Shengfeng He

In this paper, we take a different perspective to break down the makeup transfer problem into a two-step extraction-assignment process.

Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes

1 code implementation CVPR 2021 Huiting Yang, Liangyu Chai, Qiang Wen, Shuang Zhao, Zixun Sun, Shengfeng He

In this way, arbitrary attributes can be edited by collecting positive data only, and the proposed method learns a controllable representation enabling manipulation of non-binary attributes like anime styles and facial characteristics.

Smart Scribbles for Image Matting

no code implementations 31 Mar 2021 Xin Yang, Yu Qiao, Shaozhe Chen, Shengfeng He, BaoCai Yin, Qiang Zhang, Xiaopeng Wei, Rynson W. H. Lau

Image matting is an ill-posed problem that usually requires additional user input, such as trimaps or scribbles.

Image Matting

Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics

2 code implementations 31 Aug 2020 Jiangliu Wang, Jianbo Jiao, Linchao Bao, Shengfeng He, Wei Liu, Yun-hui Liu

Specifically, given an unlabeled video clip, we compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion, the spatial location and dominant color of the largest color diversity along the temporal axis, etc.

Action Recognition Representation Learning +2
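One such hand-crafted summary, the location and direction of the largest motion, can be sketched in numpy; the block grid, direction binning, and function name below are illustrative, not the paper's exact statistics.

```python
import numpy as np

def dominant_motion(flow):
    """Find the spatial block with the largest mean motion magnitude
    in an (H, W, 2) optical-flow field, and quantize its mean
    direction into one of 8 bins."""
    h, w, _ = flow.shape
    bh, bw = h // 2, w // 2                 # 2x2 grid of blocks for simplicity
    best, best_block = -1.0, None
    for i in range(2):
        for j in range(2):
            blk = flow[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            mag = np.linalg.norm(blk, axis=-1).mean()
            if mag > best:
                best, best_block = mag, (i, j, blk)
    i, j, blk = best_block
    angle = np.arctan2(blk[..., 1].mean(), blk[..., 0].mean())
    direction = int(((angle + np.pi) / (2 * np.pi)) * 8) % 8
    return (i, j), direction

flow = np.zeros((8, 8, 2))
flow[:4, :4, 0] = 2.0                       # top-left block moves right
loc, direction = dominant_motion(flow)
```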

TENet: Triple Excitation Network for Video Salient Object Detection

no code implementations ECCV 2020 Sucheng Ren, Chu Han, Xin Yang, Guoqiang Han, Shengfeng He

In this paper, we propose a simple yet effective approach, named Triple Excitation Network, to reinforce the training of video salient object detection (VSOD) from three aspects, spatial, temporal, and online excitations.

object-detection Salient Object Detection +1

Over-crowdedness Alert! Forecasting the Future Crowd Distribution

no code implementations 9 Jun 2020 Yuzhen Niu, Weifeng Shi, Wenxi Liu, Shengfeng He, Jia Pan, Antoni B. Chan

In this paper, we formulate a novel crowd analysis problem, in which we aim to predict the crowd distribution in the near future given sequential frames of a crowd video without any identity annotations.

Context-aware and Scale-insensitive Temporal Repetition Counting

1 code implementation CVPR 2020 Huaidong Zhang, Xuemiao Xu, Guoqiang Han, Shengfeng He

It avoids the heavy computation of exhaustively searching all the cycle lengths in the video, and, instead, it propagates the coarse prediction for further refinement in a hierarchical manner.

Laplacian Denoising Autoencoder

no code implementations 30 Mar 2020 Jianbo Jiao, Linchao Bao, Yunchao Wei, Shengfeng He, Honghui Shi, Rynson Lau, Thomas S. Huang

This can be naturally generalized to span multiple scales with a Laplacian pyramid representation of the input data.

Denoising Self-Supervised Learning

Visualizing the Invisible: Occluded Vehicle Segmentation and Recovery

no code implementations ICCV 2019 Xiaosheng Yan, Yuanlong Yu, Feigege Wang, Wenxi Liu, Shengfeng He, Jia Pan

We conduct comparison experiments on this dataset and demonstrate that our model outperforms the state-of-the-art in tasks of recovering segmentation mask and appearance for occluded vehicles.

Active Matting

no code implementations NeurIPS 2018 Xin Yang, Ke Xu, Shaozhe Chen, Shengfeng He, Baocai Yin, Rynson Lau

Our aim is to discover the most informative sequence of regions for user input in order to produce a good alpha matte with minimum labeling efforts.

Image Matting

Joint Face Hallucination and Deblurring via Structure Generation and Detail Enhancement

no code implementations 22 Nov 2018 Yibing Song, Jiawei Zhang, Lijun Gong, Shengfeng He, Linchao Bao, Jinshan Pan, Qingxiong Yang, Ming-Hsuan Yang

We first propose a facial component guided deep Convolutional Neural Network (CNN) to restore a coarse face image, denoted as the base image, whose facial components are automatically generated from the input face image.

Deblurring Face Hallucination +1

Deformable Object Tracking with Gated Fusion

no code implementations 27 Sep 2018 Wenxi Liu, Yibing Song, Dengsheng Chen, Shengfeng He, Yuanlong Yu, Tao Yan, Gerhard P. Hancke, Rynson W. H. Lau

In addition, we also propose a gated fusion scheme to control how the variations captured by the deformable convolution affect the original appearance.

Object Tracking

SINet: A Scale-insensitive Convolutional Neural Network for Fast Vehicle Detection

no code implementations 2 Apr 2018 Xiaowei Hu, Xuemiao Xu, Yongjie Xiao, Hao Chen, Shengfeng He, Jing Qin, Pheng-Ann Heng

Based on these findings, we present a scale-insensitive convolutional neural network (SINet) for fast detection of vehicles with a large variance of scales.

Fast Vehicle Detection object-detection +1

Egocentric Hand Detection Via Dynamic Region Growing

no code implementations 10 Nov 2017 Shao Huang, Weiqiang Wang, Shengfeng He, Rynson W. H. Lau

Egocentric videos, which mainly record the activities carried out by the users of the wearable cameras, have drawn much research attention in recent years.

Action Recognition Gesture Recognition +2

Delving Into Salient Object Subitizing and Detection

no code implementations ICCV 2017 Shengfeng He, Jianbo Jiao, Xiaodan Zhang, Guoqiang Han, Rynson W. H. Lau

Experiments show that the proposed multi-task network outperforms existing multi-task architectures, and the auxiliary subitizing network provides strong guidance to salient object detection by reducing false positives and producing coherent saliency maps.

object-detection RGB Salient Object Detection +1

Stylizing Face Images via Multiple Exemplars

no code implementations 28 Aug 2017 Yibing Song, Linchao Bao, Shengfeng He, Qingxiong Yang, Ming-Hsuan Yang

We address the problem of transferring the style of a headshot photo to face images.

RGBD Salient Object Detection via Deep Fusion

no code implementations 12 Jul 2016 Liangqiong Qu, Shengfeng He, Jiawei Zhang, Jiandong Tian, Yandong Tang, Qingxiong Yang

Numerous efforts have been made to design different low-level saliency cues for RGBD saliency detection, such as color or depth contrast features and background and color compactness priors.

object-detection RGB-D Salient Object Detection +3

Exemplar-Driven Top-Down Saliency Detection via Deep Association

no code implementations CVPR 2016 Shengfeng He, Rynson W. H. Lau, Qingxiong Yang

To address it, we design a two-stage deep model to learn the intra-class association between the exemplars and query objects.

Saliency Detection

Oriented Object Proposals

no code implementations ICCV 2015 Shengfeng He, Rynson W. H. Lau

In this paper, we propose a new approach to generate oriented object proposals (OOPs) to reduce the detection error caused by various orientations of the object.

Visual Tracking via Locality Sensitive Histograms

no code implementations CVPR 2013 Shengfeng He, Qingxiong Yang, Rynson W. H. Lau, Jiang Wang, Ming-Hsuan Yang

A robust tracking framework based on the locality sensitive histograms is proposed, which consists of two main components: a new feature for tracking that is robust to illumination changes and a novel multi-region tracking algorithm that runs in realtime even with hundreds of regions.

Visual Tracking
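The efficiency of the locality sensitive histogram comes from computing it recursively, with one left-to-right and one right-to-left exponentially weighted pass. A minimal 1-D numpy sketch of that idea follows (the paper operates on 2-D images and pairs the histogram with an illumination-invariant feature; this simplified version is for illustration only).

```python
import numpy as np

def locality_sensitive_histogram(bins_1hot, alpha=0.9):
    """1-D locality sensitive histogram:
    H_p(b) = sum_q alpha**|p - q| * [pixel q falls in bin b],
    computed in O(N * B) with one left-to-right and one right-to-left
    recursive pass (the center pixel is counted by both passes,
    so it is subtracted once at the end)."""
    n, b = bins_1hot.shape
    left, right = np.zeros((n, b)), np.zeros((n, b))
    acc = np.zeros(b)
    for p in range(n):                      # left-to-right pass
        acc = bins_1hot[p] + alpha * acc
        left[p] = acc
    acc = np.zeros(b)
    for p in range(n - 1, -1, -1):          # right-to-left pass
        acc = bins_1hot[p] + alpha * acc
        right[p] = acc
    return left + right - bins_1hot

# 4 pixels quantized into 2 intensity bins (one-hot rows).
pix = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
H = locality_sensitive_histogram(pix, alpha=0.5)
```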
