Search Results for author: Shenghua Gao

Found 55 papers, 33 papers with code

DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers

no code implementations27 Apr 2022 Xianing Chen, Qiong Cao, Yujie Zhong, Jing Zhang, Shenghua Gao, DaCheng Tao

Our DearKD is a two-stage framework that first distills the inductive biases from the early intermediate layers of a CNN and then gives the transformer full play by training without distillation.

Knowledge Distillation

TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting

1 code implementation3 Apr 2022 Huazhang Hu, Sixun Dong, Yiqun Zhao, Dongze Lian, Zhengxin Li, Shenghua Gao

Existing methods focus on performing repetitive action counting in short videos, which is tough for dealing with longer videos in more realistic scenarios.

TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function

no code implementations18 Jan 2022 Yuting Xiao, Jiale Xu, Shenghua Gao

After the expansion points and corresponding coefficients are obtained, our model only needs to calculate the Taylor series to evaluate each point and the number of expansion points is independent of the generating resolution.

3D Shape Reconstruction 3D Shape Representation

SVIP: Sequence VerIfication for Procedures in Videos

1 code implementation13 Dec 2021 Yicheng Qian, Weixin Luo, Dongze Lian, Xu Tang, Peilin Zhao, Shenghua Gao

In this paper, we propose a novel sequence verification task that aims to distinguish positive video pairs performing the same action sequence from negative ones with step-level transformations but still conducting the same task.

Action Detection Action Recognition +1

Raw Bayer Pattern Image Synthesis for Computer Vision-oriented Image Signal Processing Pipeline Design

no code implementations25 Oct 2021 Wei Zhou, Xiangyu Zhang, Hongyu Wang, Shenghua Gao, Xin Lou

It is shown that by adding another transformation, the proposed method is able to synthesize high-quality RAW Bayer images with arbitrary size.

Demosaicking Image Generation +2

CTP-Net For Cross-Domain Trajectory Prediction

no code implementations22 Oct 2021 Pingxuan Huang, Yanyan Fang, Bo Hu, Shenghua Gao, Jing Li

Deep learning based trajectory prediction methods rely on large amount of annotated future trajectories, but may not generalize well to a new scenario captured by another camera.

Domain Adaptation Trajectory Prediction

Proxy-bridged Image Reconstruction Network for Anomaly Detection in Medical Images

no code implementations5 Oct 2021 Kang Zhou, Jing Li, Weixin Luo, Zhengxin Li, Jianlong Yang, Huazhu Fu, Jun Cheng, Jiang Liu, Shenghua Gao

To mitigate this problem, in this paper, we propose a novel Proxy-bridged Image Reconstruction Network (ProxyAno) for anomaly detection in medical images.

Anomaly Detection Image Reconstruction

Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates

1 code implementation ICCV 2021 Shenhan Qian, Zhi Tu, YiHao Zhi, Wen Liu, Shenghua Gao

Co-speech gesture generation is to synthesize a gesture sequence that not only looks real but also matches with the input speech audio.

Gesture Generation

AS-MLP: An Axial Shifted MLP Architecture for Vision

2 code implementations ICLR 2022 Dongze Lian, Zehao Yu, Xing Sun, Shenghua Gao

Our proposed AS-MLP obtains 51. 5 mAP on the COCO validation set and 49. 5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures.

Object Detection Semantic Segmentation

Look Before You Leap: Learning Landmark Features for One-Stage Visual Grounding

1 code implementation CVPR 2021 Binbin Huang, Dongze Lian, Weixin Luo, Shenghua Gao

Then we combine the contextual information from the landmark feature convolution module with the target's visual features for grounding.

Visual Grounding

Layout-Guided Novel View Synthesis from a Single Indoor Panorama

1 code implementation CVPR 2021 Jiale Xu, Jia Zheng, Yanyu Xu, Rui Tang, Shenghua Gao

Then, we leverage the room layout prior, a strong structural constraint of the indoor scene, to guide the generation of target views.

Novel View Synthesis

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

1 code implementation CVPR 2021 Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao

This paper proposes a framework for the interactive video object segmentation (VOS) in the wild where users can choose some frames for annotations iteratively.

Frame Interactive Video Object Segmentation

Crowd Counting With Partial Annotations in an Image

1 code implementation ICCV 2021 Yanyu Xu, Ziming Zhong, Dongze Lian, Jing Li, Zhengxin Li, Xinxing Xu, Shenghua Gao

To fully leverage the data captured from different scenes with different view angles while reducing the annotation cost, this paper studies a novel crowd counting setting, i. e. only using partial annotations in each image as training data.

Active Learning Crowd Counting

Amodal Segmentation Based on Visible Region Segmentation and Shape Prior

1 code implementation10 Dec 2020 Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao

In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation.

Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

3 code implementations18 Nov 2020 Wen Liu, Zhixin Piao, Zhi Tu, Wenhan Luo, Lin Ma, Shenghua Gao

Also, we build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis.

Denoising Image Generation +1

SIRI: Spatial Relation Induced Network For Spatial Description Resolution

no code implementations NeurIPS 2020 Peiyao Wang, Weixin Luo, Yanyu Xu, Haojie Li, Shugong Xu, Jianyu Yang, Shenghua Gao

Spatial Description Resolution, as a language-guided localization task, is proposed for target location in a panoramic street view, given corresponding language descriptions.

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images

1 code implementation ECCV 2020 Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao

In the end, we further utilize the reconstructed image to extract the structure and measure the difference between structure extracted from original and the reconstructed image.

Anomaly Detection

P$^{2}$Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation

1 code implementation15 Jul 2020 Zehao Yu, Lei Jin, Shenghua Gao

Furthermore, because those textureless regions in indoor scenes (e. g., wall, floor, roof, \etc) usually correspond to planar regions, we propose to leverage superpixels as a plane prior.

Depth Estimation Superpixels

Towards Fast Adaptation of Neural Architectures with Meta Learning

1 code implementation ICLR 2020 Dongze Lian, Yin Zheng, Yintao Xu, Yanxiong Lu, Leyu Lin, Peilin Zhao, Junzhou Huang, Shenghua Gao

Recently, Neural Architecture Search (NAS) has been successfully applied to multiple artificial intelligence areas and shows better performance compared with hand-designed networks.

Few-Shot Learning Neural Architecture Search

Fast-MVSNet: Sparse-to-Dense Multi-View Stereo With Learned Propagation and Gauss-Newton Refinement

1 code implementation CVPR 2020 Zehao Yu, Shenghua Gao

On one hand, the high-resolution depth map, the data-adaptive propagation method and the Gauss-Newton layer jointly guarantee the effectiveness of our method.

14 Depth Estimation

Sparse-GAN: Sparsity-constrained Generative Adversarial Network for Anomaly Detection in Retinal OCT Image

no code implementations28 Nov 2019 Kang Zhou, Shenghua Gao, Jun Cheng, Zaiwang Gu, Huazhu Fu, Zhi Tu, Jianlong Yang, Yitian Zhao, Jiang Liu

With the development of convolutional neural network, deep learning has shown its success for retinal disease detection from optical coherence tomography (OCT) images.

Anomaly Detection

Liquid Warping GAN: A Unified Framework for Human Motion Imitation, Appearance Transfer and Novel View Synthesis

2 code implementations ICCV 2019 Wen Liu, Zhixin Piao, Jie Min, Wenhan Luo, Lin Ma, Shenghua Gao

In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape, which can not only model the joint location and rotation but also characterize the personalized body shape.

Denoising Novel View Synthesis

Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling

1 code implementation ECCV 2020 Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, Zihan Zhou

Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding.

Room Layout Estimation

Locality-constrained Spatial Transformer Network for Video Crowd Counting

1 code implementation18 Jul 2019 Yanyan Fang, Biyun Zhan, Wandi Cai, Shenghua Gao, Bo Hu

Then to relate the density maps between neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of next frame with that of current frame.

Crowd Counting Frame +1

Believe It or Not, We Know What You Are Looking at!

1 code implementation4 Jul 2019 Dongze Lian, Zehao Yu, Shenghua Gao

There are two merits for our two-stage solution based gaze following: i) our solution mimics the behavior of human in gaze following, therefore it is more psychological plausible; ii) besides using heatmap to supervise the output of our network, we can also leverage gaze direction to facilitate the training of gaze direction pathway, therefore our network can be more robustly trained.

Learning Semantics-aware Distance Map with Semantics Layering Network for Amodal Instance Segmentation

1 code implementation30 May 2019 Ziheng Zhang, Anpei Chen, Ling Xie, Jingyi Yu, Shenghua Gao

Specifically, we first introduce a new representation, namely a semantics-aware distance map (sem-dist map), to serve as our target for amodal segmentation instead of the commonly used masks and heatmaps.

Amodal Instance Segmentation Semantic Segmentation

Domain Adversarial Reinforcement Learning for Partial Domain Adaptation

no code implementations10 May 2019 Jin Chen, Xinxiao wu, Lixin Duan, Shenghua Gao

In this more general and practical scenario, a major challenge is how to select source instances in the shared classes across different domains for positive transfer.

Partial Domain Adaptation Q-Learning +1

Generic Multiview Visual Tracking

no code implementations4 Apr 2019 Minye Wu, Haibin Ling, Ning Bi, Shenghua Gao, Hao Sheng, Jingyi Yu

A natural solution to these challenges is to use multiple cameras with multiview inputs, though existing systems are mostly limited to specific targets (e. g. human), static cameras, and/or camera calibration.

Camera Calibration Trajectory Prediction +1

CE-Net: Context Encoder Network for 2D Medical Image Segmentation

2 code implementations7 Mar 2019 Zaiwang Gu, Jun Cheng, Huazhu Fu, Kang Zhou, Huaying Hao, Yitian Zhao, Tianyang Zhang, Shenghua Gao, Jiang Liu

In this paper, we propose a context encoder network (referred to as CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation.

Cell Segmentation Optic Disc Segmentation +1

Deep Surface Light Fields

no code implementations15 Oct 2018 Anpei Chen, Minye Wu, Yingliang Zhang, Nianyi Li, Jie Lu, Shenghua Gao, Jingyi Yu

A surface light field represents the radiance of rays originating from any points on the surface in any directions.

Data Compression Image Registration

Evaluating Capability of Deep Neural Networks for Image Classification via Information Plane

no code implementations ECCV 2018 Hao Cheng, Dongze Lian, Shenghua Gao, Yanlin Geng

Inspired by the pioneering work of information bottleneck principle for Deep Neural Networks (DNNs) analysis, we design an information plane based framework to evaluate the capability of DNNs for image classification tasks, which not only helps understand the capability of DNNs, but also helps us choose a neural network which leads to higher classification accuracy more efficiently.

General Classification Image Classification +1

Multi-Cell Multi-Task Convolutional Neural Networks for Diabetic Retinopathy Grading

no code implementations31 Aug 2018 Kang Zhou, Zaiwang Gu, Wen Liu, Weixin Luo, Jun Cheng, Shenghua Gao, Jiang Liu

To considering the relationships of images with different stages, we propose a \textbf{Multi-Task} learning strategy which predicts the label with both classification and regression.

Diabetic Retinopathy Grading General Classification +1

Future Frame Prediction for Anomaly Detection – A New Baseline

1 code implementation CVPR 2018 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Frame +2

Encoding Crowd Interaction With Deep Neural Network for Pedestrian Trajectory Prediction

1 code implementation CVPR 2018 Yanyu Xu, Zhixin Piao, Shenghua Gao

Specifically, motivated by the residual learning in deep learning, we propose to predict displacement between neighboring frames for each pedestrian sequentially.

Pedestrian Trajectory Prediction Trajectory Prediction

Face Aging With Identity-Preserved Conditional Generative Adversarial Networks

2 code implementations CVPR 2018 Zongwei Wang, Xu Tang, Weixin Luo, Shenghua Gao

By grouping faces with target age together, the objective of face aging is equivalent to transferring aging patterns of faces within the target age group to the face whose aged face is to be synthesized.

Gaze Prediction in Dynamic 360° Immersive Videos

no code implementations CVPR 2018 Yanyu Xu, Yanbing Dong, Junru Wu, Zhengzhong Sun, Zhiru Shi, Jingyi Yu, Shenghua Gao

This paper explores gaze prediction in dynamic $360^circ$ immersive videos, emph{i. e.}, based on the history scan path and VR contents, we predict where a viewer will look at an upcoming time.

Gaze Prediction

Future Frame Prediction for Anomaly Detection -- A New Baseline

1 code implementation28 Dec 2017 Wen Liu, Weixin Luo, Dongze Lian, Shenghua Gao

To predict a future frame with higher quality for normal events, other than the commonly used appearance (spatial) constraints on intensity and gradient, we also introduce a motion (temporal) constraint in video prediction by enforcing the optical flow between predicted frames and ground truth frames to be consistent, and this is the first work that introduces a temporal constraint into the video prediction task.

Anomaly Detection Frame +2

Personalized Saliency and its Prediction

1 code implementation9 Oct 2017 Yanyu Xu, Shenghua Gao, Junru Wu, Nianyi Li, Jingyi Yu

Specifically, we propose to decompose a personalized saliency map (referred to as PSM) into a universal saliency map (referred to as USM) predictable by existing saliency detection models and a new discrepancy map across users that characterizes personalized saliency.

Saliency Detection

A Revisit of Sparse Coding Based Anomaly Detection in Stacked RNN Framework

1 code implementation ICCV 2017 Weixin Luo, Wen Liu, Shenghua Gao

Motivated by the capability of sparse coding based anomaly detection, we propose a Temporally-coherent Sparse Coding (TSC) where we enforce similar neighbouring frames be encoded with similar reconstruction coefficients.

Anomaly Detection

Graph Construction with Label Information for Semi-Supervised Learning

no code implementations8 Jul 2016 Liansheng Zhuang, Zihan Zhou, Jingwen Yin, Shenghua Gao, Zhouchen Lin, Yi Ma, Nenghai Yu

In the literature, most existing graph-based semi-supervised learning (SSL) methods only use the label information of observed samples in the label propagation stage, while ignoring such valuable information when learning the graph.

graph construction Graph Learning

Progressively Parsing Interactional Objects for Fine Grained Action Detection

no code implementations CVPR 2016 Bingbing Ni, Xiaokang Yang, Shenghua Gao

Fine grained video action analysis often requires reliable detection and tracking of various interacting objects and human body parts, denoted as interactional object parsing.

Action Analysis Action Recognition +3

Single-Image Crowd Counting via Multi-Column Convolutional Neural Network

3 code implementations Conference 2016 Yingying Zhang, Desen Zhou, Siqin Chen, Shenghua Gao, Yi Ma

To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map.

Crowd Counting

Constructing a Non-Negative Low Rank and Sparse Graph with Data-Adaptive Features

no code implementations3 Sep 2014 Liansheng Zhuang, Shenghua Gao, Jinhui Tang, Jingjing Wang, Zhouchen Lin, Yi Ma

This paper aims at constructing a good graph for discovering intrinsic data structures in a semi-supervised learning setting.

graph construction

PCANet: A Simple Deep Learning Baseline for Image Classification?

2 code implementations14 Apr 2014 Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, Yi Ma

In this work, we propose a very simple deep learning network for image classification which comprises only the very basic data processing components: cascaded principal component analysis (PCA), binary hashing, and block-wise histograms.

Classification Face Recognition +5

Learning by Associating Ambiguously Labeled Images

no code implementations CVPR 2013 Zinan Zeng, Shijie Xiao, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu, Yi Ma

Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image.

Cannot find the paper you are looking for? You can Submit a new open access paper.