Search Results for author: Jungong Han

Found 73 papers, 34 papers with code

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery

no code implementations 19 Mar 2024 Yunqi Miao, Jiankang Deng, Jungong Han

Although diffusion models are rising as a powerful solution for blind face restoration, they are criticized for two problems: 1) slow training and inference speed, and 2) failure in preserving identity and recovering fine-grained facial details.

Blind Face Restoration Denoising

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

no code implementations 14 Mar 2024 Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding

Consequently, a simple combination of them cannot guarantee accomplishing both training efficiency and inference efficiency with minimal costs.

Model Compression

Pixel Sentence Representation Learning

1 code implementation 13 Feb 2024 Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.

Natural Language Inference Representation Learning +3

One-Dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications

no code implementations 26 Dec 2023 Mengyao Lyu, Yuhong Yang, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, Guiguang Ding

The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors.

Text-to-Image Generation

RepViT-SAM: Towards Real-Time Segmenting Anything

2 code implementations 10 Dec 2023 Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding

Here, to achieve real-time segmenting anything on mobile devices, following MobileSAM, we replace the heavyweight image encoder in SAM with the RepViT model, ending up with the RepViT-SAM model.

Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels

1 code implementation 2 Dec 2023 Changrui Chen, Jungong Han, Kurt Debattista

Due to the costliness of labelled data in real-world applications, semi-supervised learning, underpinned by pseudo labelling, is an appealing solution.

object-detection Object Detection +1

RepViT: Revisiting Mobile CNN From ViT Perspective

7 code implementations 18 Jul 2023 Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding

Recently, lightweight Vision Transformers (ViTs) have demonstrated superior performance and lower latency, compared with lightweight Convolutional Neural Networks (CNNs), on resource-constrained mobile devices.

Dense Affinity Matching for Few-Shot Segmentation

no code implementations 17 Jul 2023 Hao Chen, Yonghan Dong, Zheming Lu, Yunlong Yu, Yingming Li, Jungong Han, Zhongfei Zhang

Few-Shot Segmentation (FSS) aims to segment the novel class images with a few annotated samples.

Few-Shot Semantic Segmentation

Filter Pruning for Efficient CNNs via Knowledge-driven Differential Filter Sampler

1 code implementation 1 Jul 2023 Shaohui Lin, Wenxuan Huang, Jiao Xie, Baochang Zhang, Yunhang Shen, Zhou Yu, Jungong Han, David Doermann

In this paper, we propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning, which globally prunes the redundant filters based on the prior knowledge of a pre-trained model in a differential and non-alternative optimization.

Image Classification Network Pruning

SegGPT Meets Co-Saliency Scene

no code implementations 8 May 2023 Yi Liu, Shoukun Xu, Dingwen Zhang, Jungong Han

Co-salient object detection aims to detect co-existing salient objects among a group of images.

Co-Salient Object Detection Object +2

Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels

1 code implementation CVPR 2023 Zixuan Ding, Ao Wang, Hui Chen, Qiang Zhang, Pengzhang Liu, Yongjun Bao, Weipeng Yan, Jungong Han

In this paper, we advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior about the label-to-label correspondence via a semantic prior prompter.

Language Modelling Self-Supervised Learning +1

Efficient RGB-T Tracking via Cross-Modality Distillation

no code implementations CVPR 2023 Tianlu Zhang, Hongyuan Guo, Qiang Jiao, Qiang Zhang, Jungong Han

Most current RGB-T trackers adopt a two-stream structure to extract unimodal RGB and thermal features and complex fusion strategies to achieve multi-modal feature fusion, which require a huge number of parameters, thus hindering their real-life applications.

RGB-T Tracking

Confidence-guided Centroids for Unsupervised Person Re-Identification

no code implementations 22 Nov 2022 Yunqi Miao, Jiankang Deng, Guiguang Ding, Jungong Han

Since samples with high confidence are exclusively involved in the formation of centroids, the identity information of low-confidence samples, i.e., boundary samples, is NOT likely to contribute to the corresponding centroid.

Pseudo Label Retrieval +1

Physically-Based Face Rendering for NIR-VIS Face Recognition

1 code implementation 11 Nov 2022 Yunqi Miao, Alexandros Lattas, Jiankang Deng, Jungong Han, Stefanos Zafeiriou

Specifically, we reconstruct 3D face shape and reflectance from a large 2D facial dataset and introduce a novel method of transforming the VIS reflectance to NIR reflectance.

Face Recognition Image Generation

Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection

no code implementations 3 Nov 2022 Fan Yang, Xinhao Xu, Hui Chen, Yuchen Guo, Jungong Han, Kai Ni, Guiguang Ding

To pick up the ground plane prior for M3OD, we propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.

Monocular 3D Object Detection object-detection

LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers

1 code implementation 23 Oct 2022 Zhuoxu Huang, Zhiyou Zhao, Banghuai Li, Jungong Han

The Transformer, with its underlying attention mechanism and its ability to capture long-range dependencies, is a natural choice for unordered point cloud data.

3D Object Detection 3D Point Cloud Classification +3

MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition

no code implementations 1 Sep 2022 Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei

In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.

Action Recognition

Boosting Video-Text Retrieval with Explicit High-Level Semantics

no code implementations 8 Aug 2022 Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding

Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a query (video).

Retrieval Text Retrieval +3

Temporal Saliency Query Network for Efficient Video Recognition

no code implementations 21 Jul 2022 Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han

For each category, its common pattern is employed as a query, and the most salient frames respond to it.

Action Recognition Video Recognition

Semi-supervised Object Detection via Virtual Category Learning

1 code implementation 7 Jul 2022 Changrui Chen, Kurt Debattista, Jungong Han

Due to the costliness of labelled data in real-world applications, semi-supervised object detectors, underpinned by pseudo labelling, are appealing.

Object object-detection +2

Re-parameterizing Your Optimizers rather than Architectures

1 code implementation 30 May 2022 Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding

For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models.

Quantization

Hybrid Routing Transformer for Zero-Shot Learning

no code implementations 29 Mar 2022 De Cheng, Gerong Wang, Bo Wang, Qiang Zhang, Jungong Han, Dingwen Zhang

This design makes the presented transformer model a hybrid of 1) top-down and bottom-up attention pathways and 2) dynamic and static routing pathways.

Attribute Zero-Shot Learning

On Exploring Pose Estimation as an Auxiliary Learning Task for Visible-Infrared Person Re-identification

1 code implementation 11 Jan 2022 Yunqi Miao, Nianchang Huang, Xiao Ma, Qiang Zhang, Jungong Han

Visible-infrared person re-identification (VI-ReID) has been challenging due to the existence of large discrepancies between visible and infrared modalities.

Auxiliary Learning Knowledge Distillation +2

Cross-Modality Deep Feature Learning for Brain Tumor Segmentation

no code implementations 7 Jan 2022 Dingwen Zhang, Guohai Huang, Qiang Zhang, Jungong Han, Junwei Han, Yizhou Yu

Recent advances in machine learning and prevalence of digital medical images have opened up an opportunity to address the challenging brain tumor segmentation (BTS) task by using deep convolutional neural networks.

Brain Tumor Segmentation Segmentation +1

FMCNet: Feature-Level Modality Compensation for Visible-Infrared Person Re-Identification

no code implementations CVPR 2022 Qiang Zhang, Changzhou Lai, Jianan Liu, Nianchang Huang, Jungong Han

Then, a feature-level modality compensation module is presented to generate those missing modality-specific features from existing modality-shared ones.

Person Re-Identification

RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality

4 code implementations CVPR 2022 Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding

Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.

Image Classification Semantic Segmentation

LODE: Deep Local Deblurring and A New Benchmark

1 code implementation 19 Sep 2021 Zerun Wang, Liuyu Xiang, Fan Yang, Jinzhao Qian, Jie Hu, Haidong Huang, Jungong Han, Yuchen Guo, Guiguang Ding

While recent deep deblurring algorithms have achieved remarkable progress, most existing methods focus on the global deblurring problem, where the image blur mostly arises from severe camera shake.

Deblurring

Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning

no code implementations 3 Sep 2021 Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han

Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains.

Attribute Few-Shot Learning

Manipulating Identical Filter Redundancy for Efficient Pruning on Deep and Complicated CNN

2 code implementations 30 Jul 2021 Xiaohan Ding, Tianxiang Hao, Jungong Han, Yuchen Guo, Guiguang Ding

The existence of redundancy in Convolutional Neural Networks (CNNs) enables us to remove some filters/channels with acceptable performance drops.

Network Pruning

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

9 code implementations 5 May 2021 Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.

Face Recognition Image Classification +1

Exploring Modality-shared Appearance Features and Modality-invariant Relation Features for Cross-modality Person Re-Identification

no code implementations 23 Apr 2021 Nianchang Huang, Jianan Liu, Qiang Zhang, Jungong Han

Most existing cross-modality person re-identification works rely on discriminative modality-shared features for reducing cross-modality variations and intra-modality variations.

Cross-Modality Person Re-identification Person Re-Identification

Middle-level Fusion for Lightweight RGB-D Salient Object Detection

no code implementations 23 Apr 2021 Nianchang Huang, Qiang Zhang, Jungong Han

The former one first uses two sub-networks to extract unimodal features from RGB and depth images, respectively, and then fuses them for SOD.

object-detection RGB-D Salient Object Detection +1

Few-Cost Salient Object Detection with Adversarial-Paced Learning

1 code implementation NeurIPS 2020 Dingwen Zhang, HaiBin Tian, Jungong Han

A fundamental challenge in training the existing deep saliency detection models is the requirement of large amounts of annotated data.

Object object-detection +3

Onfocus Detection: Identifying Individual-Camera Eye Contact from Unconstrained Images

1 code implementation 29 Mar 2021 Dingwen Zhang, Bo Wang, Gerong Wang, Qiang Zhang, Jiajia Zhang, Jungong Han, Zheng You

Onfocus detection aims at identifying whether the focus of the individual captured by a camera is on the camera or not.

Diverse Branch Block: Building a Convolution as an Inception-like Unit

2 code implementations CVPR 2021 Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding

We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs.

Image Classification object-detection +2

Densely Nested Top-Down Flows for Salient Object Detection

1 code implementation 18 Feb 2021 Chaowei Fang, HaiBin Tian, Dingwen Zhang, Qiang Zhang, Jungong Han, Junwei Han

To this end, this paper revisits the role of top-down modeling in salient object detection and designs a novel densely nested top-down flows (DNTDF)-based framework.

Object object-detection +2

RepVGG: Making VGG-style ConvNets Great Again

22 code implementations CVPR 2021 Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.

Image Classification Semantic Segmentation

Perception Consistency Ultrasound Image Super-resolution via Self-supervised CycleGAN

1 code implementation 28 Dec 2020 Heng Liu, Jianyong Liu, Tao Tao, Shudong Hou, Jungong Han

Due to the limitations of sensors, the transmission medium and the intrinsic properties of ultrasound, the quality of ultrasound imaging is always not ideal, especially its low spatial resolution.

Generative Adversarial Network Image Enhancement +2

ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting

6 code implementations ICCV 2021 Xiaohan Ding, Tianxiang Hao, Jianchao Tan, Ji Liu, Jungong Han, Yuchen Guo, Guiguang Ding

Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity.

Shallow Feature Based Dense Attention Network for Crowd Counting

no code implementations 17 Jun 2020 Yunqi Miao, Zijia Lin, Guiguang Ding, Jungong Han

In this paper, we propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images, which diminishes the impact of backgrounds via involving a shallow feature based attention model, and meanwhile, captures multi-scale information via densely connecting hierarchical image features.

Crowd Counting

NAS-Count: Counting-by-Density with Neural Architecture Search

no code implementations ECCV 2020 Yutao Hu, Xiao-Long Jiang, Xuhui Liu, Baochang Zhang, Jungong Han, Xian-Bin Cao, David Doermann

Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts.

Crowd Counting Density Estimation +1

Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification

1 code implementation ECCV 2020 Liuyu Xiang, Guiguang Ding, Jungong Han

We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.

General Classification Knowledge Distillation +1

Aggregation Signature for Small Object Tracking

no code implementations 24 Oct 2019 Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo

In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.

Object Object Tracking

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

4 code implementations NeurIPS 2019 Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu

Deep Neural Network (DNN) is powerful but computationally expensive and memory intensive, thus impeding its practical usage on resource-constrained front-end devices.

Model Compression

Episode-based Prototype Generating Network for Zero-Shot Learning

1 code implementation CVPR 2020 Yunlong Yu, Zhong Ji, Zhongfei Zhang, Jungong Han

We introduce a simple yet effective episode-based training framework for zero-shot learning (ZSL), where the learning system is required to recognize unseen classes given only the corresponding class semantics.

Zero-Shot Learning

ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks

5 code implementations ICCV 2019 Xiaohan Ding, Yuchen Guo, Guiguang Ding, Jungong Han

We propose Asymmetric Convolution Block (ACB), an architecture-neutral structure as a CNN building block, which uses 1D asymmetric convolutions to strengthen the square convolution kernels.

Attribute

Incremental Few-Shot Learning for Pedestrian Attribute Recognition

no code implementations 2 Jun 2019 Liuyu Xiang, Xiaoming Jin, Guiguang Ding, Jungong Han, Leida Li

Pedestrian attribute recognition has received increasing attention due to its important role in video surveillance applications.

Attribute Few-Shot Learning +1

Approximated Oracle Filter Pruning for Destructive CNN Width Optimization

1 code implementation 12 May 2019 Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, Chenggang Yan

It is not easy to design and run Convolutional Neural Networks (CNNs) due to: 1) finding the optimal number of filters (i.e., the width) at each layer is tricky, given an architecture; and 2) the computational intensity of CNNs impedes the deployment on computationally limited devices.

Saliency-Guided Attention Network for Image-Sentence Matching

no code implementations ICCV 2019 Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang

Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.

Sentence

Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure

1 code implementation CVPR 2019 Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han

Redundancy is widely recognized in Convolutional Neural Networks (CNNs), which makes it possible to remove unimportant filters from convolutional layers so as to slim the network with an acceptable performance drop.

Pixelated Semantic Colorization

no code implementations 27 Jan 2019 Jiaojiao Zhao, Jungong Han, Ling Shao, Cees G. M. Snoek

We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator.

Colorization Image Colorization +2

Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation

no code implementations 30 Nov 2018 Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann

The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.

Pixel-level Semantics Guided Image Colorization

no code implementations 5 Aug 2018 Jiaojiao Zhao, Li Liu, Cees G. M. Snoek, Jungong Han, Ling Shao

While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from the problems of context confusion and edge color bleeding.

Colorization Image Colorization +2

Modulated Convolutional Networks

no code implementations CVPR 2018 Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu

In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.

Memory Attention Networks for Skeleton-based Action Recognition

1 code implementation 23 Apr 2018 Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu

Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.

Action Recognition Skeleton Based Action Recognition +1

The Structure Transfer Machine Theory and Applications

1 code implementation 1 Apr 2018 Baochang Zhang, Lian Zhuo, Ze Wang, Jungong Han, Xian-Tong Zhen

Representation learning is a fundamental but challenging problem, especially when the distribution of data is unknown.

Image Classification Object Tracking +1

Attribute-Guided Network for Cross-Modal Zero-Shot Hashing

no code implementations 6 Feb 2018 Zhong Ji, Yuxin Sun, Yunlong Yu, Yanwei Pang, Jungong Han

To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which can perform not only IBIR, but also Text-Based Image Retrieval (TBIR).

Attribute Cross-Modal Retrieval +3

Latent Constrained Correlation Filter

no code implementations 11 Nov 2017 Baochang Zhang, Shangzhen Luan, Chen Chen, Jungong Han, Wei Wang, Alessandro Perina, Ling Shao

In this paper, we introduce an intermediate step -- solution sampling -- after the data sampling step to form a subspace, in which an optimal solution can be estimated.

Object Recognition Object Tracking

From Zero-shot Learning to Conventional Supervised Classification: Unseen Visual Data Synthesis

no code implementations CVPR 2017 Yang Long, Li Liu, Ling Shao, Fumin Shen, Guiguang Ding, Jungong Han

Using the proposed Unseen Visual Data Synthesis (UVDS) algorithm, semantic attributes are effectively utilised as an intermediate clue to synthesise unseen visual features at the training stage.

General Classification Object Recognition +1

Gabor Convolutional Networks

no code implementations 3 May 2017 Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu

Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.

Sparse Representation based Multi-sensor Image Fusion: A Review

no code implementations 12 Feb 2017 Qiang Zhang, Yi Liu, Rick S. Blum, Jungong Han, Dacheng Tao

As a result of several successful applications in computer vision and image processing, sparse representation (SR) has attracted significant attention in multi-sensor image fusion.

Dictionary Learning Infrared And Visible Image Fusion

Latent Constrained Correlation Filters for Object Localization

no code implementations 7 Jun 2016 Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen

There is a neglected fact in the traditional machine learning methods that the data sampling can actually lead to the solution sampling.

Object Object Localization
