Search Results for author: Wen Gao

Found 78 papers, 24 papers with code

STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction

1 code implementation 9 Jun 2022 Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

To solve the information loss problem, the proposed model aims to preserve the spatiotemporal information for videos during the feature extraction and the state transitions, respectively.

Video Prediction

Learning Weighting Map for Bit-Depth Expansion within a Rational Range

1 code implementation 26 Apr 2022 Yuqing Liu, Qi Jia, Jian Zhang, Xin Fan, Shanshe Wang, Siwei Ma, Wen Gao

Existing BDE methods have no unified solution for various BDE situations, and directly learn a mapping for each pixel from LBD image to the desired value in HBD image, which may change the given high-order bits and lead to a huge deviation from the ground truth.

SSIM

STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond

no code implementations 20 Apr 2022 Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond by exploring the significant spatiotemporal correlations in videos.

Action Recognition Object Detection +2

Gradient Correction beyond Gradient Descent

no code implementations 16 Mar 2022 Zefan Li, Bingbing Ni, Teng Li, Wenjun Zhang, Wen Gao

GCGD consists of two plug-in modules: 1) inspired by the idea of gradient prediction, we propose a GC-W module for weight gradient correction; 2) based on Neural ODE, we propose a GC-ODE module for hidden states gradient correction.

Cross-SRN: Structure-Preserving Super-Resolution Network with Cross Convolution

no code implementations 5 Jan 2022 Yuqing Liu, Qi Jia, Xin Fan, Shanshe Wang, Siwei Ma, Wen Gao

It is challenging to restore low-resolution (LR) images to super-resolution (SR) images with correct and clear details.

Super-Resolution

Instance-Aware Dynamic Neural Network Quantization

1 code implementation CVPR 2022 Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

However, natural images are of huge diversity with abundant content and using such a universal quantization configuration for all samples is not an optimal strategy.

Quantization

Towards End-to-End Image Compression and Analysis with Transformers

1 code implementation 17 Dec 2021 Yuanchao Bai, Xu Yang, Xianming Liu, Junjun Jiang, YaoWei Wang, Xiangyang Ji, Wen Gao

Meanwhile, we propose a feature aggregation module to fuse the compressed features with the selected intermediate features of the Transformer, and feed the aggregated features to a deconvolutional neural network for image reconstruction.

Classification Image Classification +3

MAU: A Motion-Aware Unit for Video Prediction and Beyond

1 code implementation NeurIPS 2021 Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Yan Ye, Xiang Xinguang, Wen Gao

The attention module aims to learn an attention map based on the correlations between the current spatial state and the historical spatial states.

Action Recognition Video Prediction

Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation

1 code implementation 29 Jul 2021 Wenkang Shan, Haopeng Lu, Shanshe Wang, Xinfeng Zhang, Wen Gao

To alleviate these two problems, we propose a relative information encoding method that yields positional and temporal enhanced representations.

Monocular 3D Human Pose Estimation

Post-Training Quantization for Vision Transformer

no code implementations NeurIPS 2021 Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao

Recently, transformers have achieved remarkable performance on a variety of computer vision applications.

Quantization

Rate Distortion Characteristic Modeling for Neural Image Compression

no code implementations 24 Jun 2021 Chuanmin Jia, Ziqing Ge, Shanshe Wang, Siwei Ma, Wen Gao

End-to-end optimized neural image compression (NIC) has obtained superior lossy compression performance recently.

Image Compression

Progressive Stage-wise Learning for Unsupervised Feature Representation Enhancement

no code implementations CVPR 2021 Zefan Li, Chenxi Liu, Alan Yuille, Bingbing Ni, Wenjun Zhang, Wen Gao

For a given unsupervised task, we design multilevel tasks and define different learning stages for the deep network.

Recent Standard Development Activities on Video Coding for Machines

no code implementations 26 May 2021 Wen Gao, Shan Liu, Xiaozhong Xu, Manouchehr Rafie, Yuan Zhang, Igor Curcio

Specifically, we will first provide an overview of the MPEG VCM group including use cases, requirements, processing pipelines, plan for potential VCM standards, followed by the evaluation framework including machine-vision tasks, dataset, evaluation metrics, and anchor generation.

Object Detection

Intrinsic Temporal Regularization for High-resolution Human Video Synthesis

no code implementations 11 Dec 2020 Lingbo Yang, Zhanning Gao, Peiran Ren, Siwei Ma, Wen Gao

Temporal consistency is crucial for extending image processing pipelines to the video domain, which is often enforced with flow-based warping error over adjacent frames.

Motion Estimation

Pre-Trained Image Processing Transformer

3 code implementations CVPR 2021 Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao

To maximally excavate the capability of transformer, we present to utilize the well-known ImageNet benchmark for generating a large amount of corrupted image pairs.

 Ranked #1 on Single Image Deraining on Rain100L (using extra training data)

Color Image Denoising Contrastive Learning +2

Implicit Subspace Prior Learning for Dual-Blind Face Restoration

1 code implementation 12 Oct 2020 Lingbo Yang, Pan Wang, Zhanning Gao, Shanshe Wang, Peiran Ren, Siwei Ma, Wen Gao

Face restoration is an inherently ill-posed problem, where additional prior constraints are typically considered crucial for mitigating such pathology.

Blind Face Restoration

Progressive Multi-Scale Residual Network for Single Image Super-Resolution

no code implementations 19 Jul 2020 Yuqing Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

However, recent multi-scale networks usually aim to build the hierarchical exploration with different sizes of filters, which lead to high computation complexity costs, and seldom focus on the inherent correlations among different scales.

Image Restoration Image Super-Resolution +2

Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

no code implementations 26 May 2020 Lingbo Yang, Pan Wang, Chang Liu, Zhanning Gao, Peiran Ren, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Xian-Sheng Hua, Wen Gao

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality.

Pose Transfer

Iterative Network for Image Super-Resolution

1 code implementation 20 May 2020 Yuqing Liu, Shiqi Wang, Jian Zhang, Shanshe Wang, Siwei Ma, Wen Gao

A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.

Image Super-Resolution Single Image Super Resolution +1

HiFaceGAN: Face Renovation via Collaborative Suppression and Replenishment

1 code implementation 11 May 2020 Lingbo Yang, Chang Liu, Pan Wang, Shanshe Wang, Peiran Ren, Siwei Ma, Wen Gao

Existing face restoration research typically relies on either the degradation prior or explicit guidance labels for training, which often results in limited generalization ability over real-world images with heterogeneous degradations and rich background contents.

Blind Face Restoration Face Hallucination +3

Segatron: Segment-Aware Transformer for Language Modeling and Understanding

1 code implementation 30 Apr 2020 He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, Ming Li

To verify this, we propose a segment-aware Transformer (Segatron), by replacing the original token position encoding with a combined position encoding of paragraph, sentence, and token.

Language Modelling Masked Language Modeling +1

Towards Analysis-friendly Face Representation with Scalable Feature and Texture Compression

no code implementations 21 Apr 2020 Shurun Wang, Shiqi Wang, Wenhan Yang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In particular, we study the feature and texture compression in a scalable coding framework, where the base layer serves as the deep learning feature and enhancement layer targets to perfectly reconstruct the texture.

Image Compression

Universal Adversarial Perturbations Generative Network for Speaker Recognition

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Attacking deep learning based biometric systems has drawn more and more attention with the wide deployment of fingerprint/face/speaker recognition systems, given that neural networks are vulnerable to adversarial examples, which have been intentionally perturbed to remain almost imperceptible to humans.

Speaker Recognition

Learning to fool the speaker recognition

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

Due to the widespread deployment of fingerprint/face/speaker recognition systems, attacking deep learning based biometric systems has drawn more and more attention.

Audio and Speech Processing Cryptography and Security Sound

Direct Speech-to-image Translation

1 code implementation 7 Apr 2020 Jiguo Li, Xinfeng Zhang, Chuanmin Jia, Jizheng Xu, Li Zhang, Yue Wang, Siwei Ma, Wen Gao

In this paper, we attempt to translate the speech signals into the image signals without the transcription stage.

Multimedia Sound Audio and Speech Processing

Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis

no code implementations 17 Mar 2020 Ruifeng Shi, Deming Zhai, Xian-Ming Liu, Junjun Jiang, Wen Gao

However, the performance of CNN-based classification approaches depends on a large amount of high-quality manually labeled training data, whose labels inevitably contain noise in practice, leading to model overfitting and performance degradation.

General Classification Image Classification +1

Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

no code implementations 10 Jan 2020 Ling-Yu Duan, Jiaying Liu, Wenhan Yang, Tiejun Huang, Wen Gao

Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence to realize the collaborative compression of video and feature streams in a broad range of AI applications.

Video Compression

Knowledge Transfer via Student-Teacher Collaboration

no code implementations 25 Sep 2019 Tianxiao Gao, Ruiqin Xiong, Zhenhua Liu, Siwei Ma, Feng Wu, Tiejun Huang, Wen Gao

One way to compress these heavy models is knowledge transfer (KT), in which a light student network is trained through absorbing the knowledge from a powerful teacher network.

Transfer Learning

Global-Local Temporal Representations For Video Person Re-Identification

no code implementations ICCV 2019 Jianing Li, Jingdong Wang, Qi Tian, Wen Gao, Shiliang Zhang

The long-term relations are captured by a temporal self-attention model to alleviate the occlusions and noises in video sequences.

Metric Learning Re-Ranking +1

Towards Digital Retina in Smart Cities: A Model Generation, Utilization and Communication Paradigm

1 code implementation 31 Jul 2019 Yihang Lou, Ling-Yu Duan, Yong Luo, Ziqian Chen, Tongliang Liu, Shiqi Wang, Wen Gao

The digital retina in smart cities is to select what the City Eye tells the City Brain, and convert the acquired visual data from front-end visual sensors to features in an intelligent sensing manner.

Single Image Blind Deblurring Using Multi-Scale Latent Structure Prior

no code implementations 11 Jun 2019 Yuanchao Bai, Huizhu Jia, Ming Jiang, Xian-Ming Liu, Xiaodong Xie, Wen Gao

Blind image deblurring is a challenging problem in computer vision, which aims to restore both the blur kernel and the latent sharp image from only a blurry observation.

Blind Image Deblurring Computer Vision +4

Masked Non-Autoregressive Image Captioning

no code implementations 3 Jun 2019 Junlong Gao, Xi Meng, Shiqi Wang, Xia Li, Shanshe Wang, Siwei Ma, Wen Gao

Existing captioning models often adopt the encoder-decoder architecture, where the decoder uses autoregressive decoding to generate captions, such that each token is generated sequentially given the preceding generated tokens.

Image Captioning Machine Translation +1

Self-critical n-step Training for Image Captioning

no code implementations CVPR 2019 Junlong Gao, Shiqi Wang, Shanshe Wang, Siwei Ma, Wen Gao

Existing methods for image captioning are usually trained by cross entropy loss, which leads to exposure bias and the inconsistency between the optimizing function and evaluation metrics.

Image Captioning

Scalable Facial Image Compression with Deep Feature Reconstruction

no code implementations 14 Mar 2019 Shurun Wang, Shiqi Wang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao

In this paper, we propose a scalable image compression scheme, including the base layer for feature representation and enhancement layer for texture representation.

Image Compression

Attention Driven Person Re-identification

no code implementations 13 Oct 2018 Fan Yang, Ke Yan, Shijian Lu, Huizhu Jia, Xiaodong Xie, Wen Gao

Person re-identification (ReID) is a challenging task due to arbitrary human pose variations, background clutters, etc.

Person Re-Identification

Computed Tomography Image Enhancement using 3D Convolutional Neural Network

no code implementations 18 Jul 2018 Meng Li, Shiwen Shen, Wen Gao, William Hsu, Jason Cong

Computed tomography (CT) is increasingly being used for cancer screening, such as early detection of lung cancer.

Computed Tomography (CT) Image Enhancement +1

RAM: A Region-Aware Deep Model for Vehicle Re-Identification

no code implementations 25 Jun 2018 Xiaobin Liu, Shiliang Zhang, Qingming Huang, Wen Gao

Specifically, in addition to extracting global features, RAM also extracts features from a series of local regions.

Vehicle Re-Identification

Depth-Aware Stereo Video Retargeting

no code implementations CVPR 2018 Bing Li, Chia-Wen Lin, Boxin Shi, Tiejun Huang, Wen Gao, C. -C. Jay Kuo

As compared with traditional video retargeting, stereo video retargeting poses new challenges because stereo video contains the depth information of salient objects and its time dynamics.

Graph-Based Blind Image Deblurring From a Single Photograph

no code implementations 22 Feb 2018 Yuanchao Bai, Gene Cheung, Xian-Ming Liu, Wen Gao

We leverage the new graph spectral interpretation for RGTV to design an efficient algorithm that solves for the skeleton image and the blur kernel alternately.

Blind Image Deblurring Image Deblurring

Blind Image Deblurring via Reweighted Graph Total Variation

no code implementations 24 Dec 2017 Yuanchao Bai, Gene Cheung, Xian-Ming Liu, Wen Gao

The problem can be solved in two parts: i) estimate a blur kernel from the blurry image, and ii) given estimated blur kernel, de-convolve blurry input to restore the target image.

Blind Image Deblurring Image Deblurring

LVreID: Person Re-Identification with Long Sequence Videos

no code implementations 20 Dec 2017 Jianing Li, Shiliang Zhang, Jingdong Wang, Wen Gao, Qi Tian

This paper mainly establishes a large-scale Long sequence Video database for person re-IDentification (LVreID).

Person Re-Identification

AI Oriented Large-Scale Video Management for Smart City: Technologies, Standards and Beyond

no code implementations 5 Dec 2017 Ling-Yu Duan, Yihang Lou, Shiqi Wang, Wen Gao, Yong Rui

To practically facilitate deep neural network models in the large-scale video analysis, there are still unprecedented challenges for the large-scale video data management.

Computer Vision

Person Transfer GAN to Bridge Domain Gap for Person Re-Identification

21 code implementations CVPR 2018 Longhui Wei, Shiliang Zhang, Wen Gao, Qi Tian

Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.

Person Re-Identification Unsupervised Domain Adaptation

A Bio-Inspired Multi-Exposure Fusion Framework for Low-light Image Enhancement

no code implementations 2 Nov 2017 Zhenqiang Ying, Ge Li, Wen Gao

Inspired by the human visual system, we design a multi-exposure fusion framework for low-light image enhancement.

Computer Vision Low-Light Image Enhancement

Pose-driven Deep Convolutional Model for Person Re-identification

no code implementations ICCV 2017 Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian

Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts.

Person Re-Identification

GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval

no code implementations 13 Sep 2017 Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian

Targeting to solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.

Person Re-Identification Representation Learning

Beyond Monte Carlo Tree Search: Playing Go with Deep Alternative Neural Network and Long-Term Evaluation

no code implementations 13 Jun 2017 Jinzhuo Wang, Wenmin Wang, Ronggang Wang, Wen Gao

We show that such a setting can preserve more context of local features and their evolution, which is beneficial for move prediction.

Compact Descriptors for Video Analysis: the Emerging MPEG Standard

no code implementations 26 Apr 2017 Ling-Yu Duan, Vijay Chandrasekhar, Shiqi Wang, Yihang Lou, Jie Lin, Yan Bai, Tiejun Huang, Alex ChiChung Kot, Wen Gao

This paper provides an overview of the on-going compact descriptors for video analysis standard (CDVA) from the ISO/IEC moving pictures experts group (MPEG).

Correlation Preserving Sparse Coding Over Multi-level Dictionaries for Image Denoising

no code implementations 23 Dec 2016 Rui Chen, Huizhu Jia, Xiaodong Xie, Wen Gao

In this letter, we propose a novel image denoising method based on correlation preserving sparse coding.

Image Denoising

An Attention-Driven Approach of No-Reference Image Quality Assessment

no code implementations 12 Dec 2016 Diqi Chen, Yizhou Wang, Tianfu Wu, Wen Gao

The model learning is implemented by a reinforcement strategy, in which the rewards of both tasks guide the learning of the optimal sampling policy to acquire the "task-informative" image regions so that the predictions can be made accurately and efficiently (in terms of the sampling steps).

Multi-Task Learning No-Reference Image Quality Assessment +1

Globally Variance-Constrained Sparse Representation and Its Application in Image Set Coding

no code implementations 17 Aug 2016 Xiang Zhang, Jiarui Sun, Siwei Ma, Zhouchen Lin, Jian Zhang, Shiqi Wang, Wen Gao

Therefore, introducing an accurate rate-constraint in sparse coding and dictionary learning becomes meaningful, which has not been fully exploited in the context of sparse representation.

Data Compression Dictionary Learning

Deep Attributes Driven Multi-Camera Person Re-identification

no code implementations 11 May 2016 Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian

And we propose a semi-supervised attribute learning framework which progressively boosts the accuracy of attributes only using a limited number of labeled data.

Metric Learning Person Re-Identification

Maximal Sparsity with Deep Networks?

no code implementations NeurIPS 2016 Bo Xin, Yizhou Wang, Wen Gao, David Wipf

The iterations of many sparse estimation algorithms are comprised of a fixed linear filter cascaded with a thresholding nonlinearity, which collectively resemble a typical neural network layer.

Multi-Task Learning With Low Rank Attribute Embedding for Person Re-Identification

no code implementations ICCV 2015 Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao

Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.

Multi-Task Learning Person Re-Identification

Cross-pose Face Recognition by Canonical Correlation Analysis

no code implementations 29 Jul 2015 Annan Li, Shiguang Shan, Xilin Chen, Bingpeng Ma, Shuicheng Yan, Wen Gao

We argue that one of the difficulties in this problem is the severe misalignment in face images or feature vectors with different poses.

Face Recognition

Image Denoising via Adaptive Soft-Thresholding Based on Non-Local Samples

no code implementations CVPR 2015 Hangfan Liu, Ruiqin Xiong, Jian Zhang, Wen Gao

To estimate the expectation and variance parameters for the transform bands of a particular patch, we exploit the non-local correlation of image and collect a set of similar patches as data samples to form the distribution.

Image Denoising

Stable Feature Selection from Brain sMRI

no code implementations 25 Mar 2015 Bo Xin, Lingjing Hu, Yizhou Wang, Wen Gao

Neuroimage analysis usually involves learning thousands or even millions of variables using only a limited number of samples.

feature selection

Robust Estimation of 3D Human Poses from a Single Image

no code implementations CVPR 2014 Chunyu Wang, Yizhou Wang, Zhouchen Lin, Alan L. Yuille, Wen Gao

We address the challenges in three ways: (i) We represent a 3D pose as a linear combination of a sparse set of bases learned from 3D human skeletons.

3D Human Pose Estimation 3D Pose Estimation +1

Group-based Sparse Representation for Image Restoration

1 code implementation 14 May 2014 Jian Zhang, Debin Zhao, Wen Gao

In this paper, instead of using patch as the basic unit of sparse representation, we exploit the concept of group as the basic unit of sparse representation, which is composed of nonlocal patches with similar structures, and establish a novel sparse representation modeling of natural images, called group-based sparse representation (GSR).

Compressive Sensing Deblurring +4

Image Restoration Using Joint Statistical Modeling in Space-Transform Domain

no code implementations 11 May 2014 Jian Zhang, Debin Zhao, Ruiqin Xiong, Siwei Ma, Wen Gao

This paper presents a novel strategy for high-fidelity image restoration by characterizing both local smoothness and nonlocal self-similarity of natural images in a unified statistical manner.

Deblurring Image Deblurring +3

Image Compressive Sensing Recovery Using Adaptively Learned Sparsifying Basis via L0 Minimization

no code implementations 30 Apr 2014 Jian Zhang, Chen Zhao, Debin Zhao, Wen Gao

Compressive sensing (CS) theory demonstrates that a signal can be reconstructed with high probability from many fewer acquired measurements than suggested by the Nyquist sampling theory when it exhibits sparsity in some domain.

Compressive Sensing

Structural Group Sparse Representation for Image Compressive Sensing Recovery

no code implementations 29 Apr 2014 Jian Zhang, Debin Zhao, Feng Jiang, Wen Gao

Compressive Sensing (CS) theory shows that a signal can be decoded from many fewer measurements than suggested by the Nyquist sampling theory, when the signal is sparse in some domain.

Compressive Sensing
