no code implementations • 6 Apr 2024 • Zhuoxu Huang, Zhenkun Fan, Tao Xu, Jungong Han
We introduce Motion PointNet composed of a PointNet-like encoder and a PDE-solving module.
no code implementations • 19 Mar 2024 • Yunqi Miao, Jiankang Deng, Jungong Han
Although diffusion models are rising as a powerful solution for blind face restoration, they are criticized for two problems: 1) slow training and inference speed, and 2) failure in preserving identity and recovering fine-grained facial details.
no code implementations • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
Consequently, a simple combination of them cannot guarantee accomplishing both training efficiency and inference efficiency with minimal costs.
1 code implementation • 13 Feb 2024 • Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed
To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.
no code implementations • 26 Dec 2023 • Mengyao Lyu, Yuhong Yang, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, Guiguang Ding
The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors.
no code implementations • 17 Dec 2023 • Tianxiang Hao, Mengyao Lyu, Hui Chen, Sicheng Zhao, Jungong Han, Guiguang Ding
On the other hand, complicated structures and update rules largely increase the computation and storage cost.
2 code implementations • 10 Dec 2023 • Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Here, to achieve real-time segmenting anything on mobile devices, following MobileSAM, we replace the heavyweight image encoder in SAM with the RepViT model, ending up with the RepViT-SAM model.
1 code implementation • 2 Dec 2023 • Changrui Chen, Jungong Han, Kurt Debattista
Due to the costliness of labelled data in real-world applications, semi-supervised learning, underpinned by pseudo labelling, is an appealing solution.
no code implementations • 27 Sep 2023 • Ao Wang, Hui Chen, Zijia Lin, Sicheng Zhao, Jungong Han, Guiguang Ding
We further employ a consistent dynamic channel pruning (CDCP) strategy to dynamically prune unimportant channels in ViTs.
1 code implementation • ICCV 2023 • Yutao Hu, Qixiong Wang, Wenqi Shao, Enze Xie, Zhenguo Li, Jungong Han, Ping Luo
In this paper, we address this issue from two perspectives.
7 code implementations • 18 Jul 2023 • Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Recently, lightweight Vision Transformers (ViTs) demonstrate superior performance and lower latency, compared with lightweight Convolutional Neural Networks (CNNs), on resource-constrained mobile devices.
no code implementations • 17 Jul 2023 • Hao Chen, Yonghan Dong, Zheming Lu, Yunlong Yu, Yingming Li, Jungong Han, Zhongfei Zhang
Few-Shot Segmentation (FSS) aims to segment the novel class images with a few annotated samples.
1 code implementation • 1 Jul 2023 • Shaohui Lin, Wenxuan Huang, Jiao Xie, Baochang Zhang, Yunhang Shen, Zhou Yu, Jungong Han, David Doermann
In this paper, we propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning, which globally prunes the redundant filters based on the prior knowledge of a pre-trained model in a differential and non-alternative optimization.
no code implementations • 8 May 2023 • Yi Liu, Shoukun Xu, Dingwen Zhang, Jungong Han
Co-salient object detection aims to detect co-existing salient objects among a group of images.
1 code implementation • CVPR 2023 • Zixuan Ding, Ao Wang, Hui Chen, Qiang Zhang, Pengzhang Liu, Yongjun Bao, Weipeng Yan, Jungong Han
In this paper, we advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior about the label-to-label correspondence via a semantic prior prompter.
no code implementations • CVPR 2023 • Tianlu Zhang, Hongyuan Guo, Qiang Jiao, Qiang Zhang, Jungong Han
Most current RGB-T trackers adopt a two-stream structure to extract unimodal RGB and thermal features and complex fusion strategies to achieve multi-modal feature fusion, which require a huge number of parameters, thus hindering their real-life applications.
Ranked #5 on RGB-T Tracking on GTOT
no code implementations • 22 Nov 2022 • Yunqi Miao, Jiankang Deng, Guiguang Ding, Jungong Han
Since samples with high confidence are exclusively involved in the formation of centroids, the identity information of low-confidence samples, i.e., boundary samples, is not likely to contribute to the corresponding centroid.
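The effect described above can be illustrated with a minimal sketch. It assumes centroids are plain per-identity means over samples whose confidence exceeds a cut-off; the function name `confident_centroids` and the `threshold` value are hypothetical and are not taken from the paper.

```python
def confident_centroids(features, labels, confidences, threshold=0.8):
    """Average feature vectors per label, using only high-confidence samples.

    `threshold` is an illustrative cut-off, not a value from the paper.
    """
    sums, counts = {}, {}
    for feat, label, conf in zip(features, labels, confidences):
        if conf < threshold:
            continue  # low-confidence (boundary) samples are skipped entirely
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


# Three samples of the same identity; the third is a low-confidence boundary sample.
features = [[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]]
labels = [0, 0, 0]
confidences = [0.9, 0.95, 0.3]

centroids = confident_centroids(features, labels, confidences)
```

In this toy input the boundary sample at `[10.0, 10.0]` has no influence on the centroid, which is the loss of identity information that the sentence points to.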
1 code implementation • 11 Nov 2022 • Yunqi Miao, Alexandros Lattas, Jiankang Deng, Jungong Han, Stefanos Zafeiriou
Specifically, we reconstruct 3D face shape and reflectance from a large 2D facial dataset and introduce a novel method of transforming the VIS reflectance to NIR reflectance.
no code implementations • 3 Nov 2022 • Fan Yang, Xinhao Xu, Hui Chen, Yuchen Guo, Jungong Han, Kai Ni, Guiguang Ding
To pick up the ground plane prior for M3OD, we propose a Ground Plane Enhanced Network (GPENet) which resolves both issues in one go.
1 code implementation • 23 Oct 2022 • Zhuoxu Huang, Zhiyou Zhao, Banghuai Li, Jungong Han
The Transformer, with its underlying attention mechanism and its ability to capture long-range dependencies, is a natural choice for unordered point cloud data.
Ranked #1 on 3D Semantic Segmentation on SensatUrban
no code implementations • 1 Sep 2022 • Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei
In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.
no code implementations • 8 Aug 2022 • Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding
Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a text (video) query.
no code implementations • 21 Jul 2022 • Boyang Xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han
For each category, its common pattern is employed as a query, and the most salient frames respond to it.
Ranked #5 on Action Recognition on ActivityNet
1 code implementation • 7 Jul 2022 • Changrui Chen, Kurt Debattista, Jungong Han
Due to the costliness of labelled data in real-world applications, semi-supervised object detectors, underpinned by pseudo labelling, are appealing.
1 code implementation • 30 May 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding
For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models.
no code implementations • 29 Mar 2022 • De Cheng, Gerong Wang, Bo Wang, Qiang Zhang, Jungong Han, Dingwen Zhang
This design makes the presented transformer model a hybrid of 1) top-down and bottom-up attention pathways and 2) dynamic and static routing pathways.
7 code implementations • CVPR 2022 • Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, Jian Sun
We revisit large kernel design in modern convolutional neural networks (CNNs).
Ranked #73 on Image Classification on ImageNet
1 code implementation • 11 Jan 2022 • Yunqi Miao, Nianchang Huang, Xiao Ma, Qiang Zhang, Jungong Han
Visible-infrared person re-identification (VI-ReID) has been challenging due to the existence of large discrepancies between visible and infrared modalities.
no code implementations • 7 Jan 2022 • Dingwen Zhang, Guohai Huang, Qiang Zhang, Jungong Han, Junwei Han, Yizhou Yu
Recent advances in machine learning and prevalence of digital medical images have opened up an opportunity to address the challenging brain tumor segmentation (BTS) task by using deep convolutional neural networks.
no code implementations • CVPR 2022 • Qiang Zhang, Changzhou Lai, Jianan Liu, Nianchang Huang, Jungong Han
Then, a feature-level modality compensation module is presented to generate those missing modality-specific features from existing modality-shared ones.
4 code implementations • CVPR 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding
Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
Ranked #58 on Semantic Segmentation on Cityscapes val
1 code implementation • 19 Sep 2021 • Zerun Wang, Liuyu Xiang, Fan Yang, Jinzhao Qian, Jie Hu, Haidong Huang, Jungong Han, Yuchen Guo, Guiguang Ding
While recent deep deblurring algorithms have achieved remarkable progress, most existing methods focus on the global deblurring problem, where the image blur mostly arises from severe camera shake.
no code implementations • 3 Sep 2021 • Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han
Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains.
2 code implementations • 30 Jul 2021 • Xiaohan Ding, Tianxiang Hao, Jungong Han, Yuchen Guo, Guiguang Ding
The existence of redundancy in Convolutional Neural Networks (CNNs) enables us to remove some filters/channels with acceptable performance drops.
no code implementations • CVPR 2021 • Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han
Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images.
Ranked #27 on Thermal Image Segmentation on MFN Dataset
9 code implementations • 5 May 2021 • Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.
Ranked #750 on Image Classification on ImageNet
no code implementations • 23 Apr 2021 • Nianchang Huang, Jianan Liu, Qiang Zhang, Jungong Han
Most existing cross-modality person re-identification works rely on discriminative modality-shared features for reducing cross-modality variations and intra-modality variations.
Cross-Modality Person Re-identification • Person Re-Identification
no code implementations • 23 Apr 2021 • Nianchang Huang, Qiang Zhang, Jungong Han
The former one first uses two sub-networks to extract unimodal features from RGB and depth images, respectively, and then fuses them for SOD.
1 code implementation • NeurIPS 2020 • Dingwen Zhang, HaiBin Tian, Jungong Han
A fundamental challenge in training the existing deep saliency detection models is the requirement of large amounts of annotated data.
1 code implementation • 29 Mar 2021 • Dingwen Zhang, Bo Wang, Gerong Wang, Qiang Zhang, Jiajia Zhang, Jungong Han, Zheng You
Onfocus detection aims at identifying whether the focus of the individual captured by a camera is on the camera or not.
2 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs.
1 code implementation • 18 Feb 2021 • Chaowei Fang, HaiBin Tian, Dingwen Zhang, Qiang Zhang, Jungong Han, Junwei Han
To this end, this paper revisits the role of top-down modeling in salient object detection and designs a novel densely nested top-down flows (DNTDF)-based framework.
22 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.
Ranked #42 on Semantic Segmentation on Cityscapes val
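The equivalence between a multi-branch training-time block and a single 3x3 convolution can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the extra branches are a 1x1 convolution and an identity shortcut, uses a single channel, and ignores normalization layers. The helper names `conv2d_same` and `merge_branches` are hypothetical.

```python
import random


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def merge_branches(k3x3, k1x1):
    """Fold an assumed 1x1 branch and identity branch into one 3x3 kernel."""
    merged = [row[:] for row in k3x3]
    # Both the 1x1 weight and the identity act only on the centre tap.
    merged[1][1] += k1x1[0][0] + 1.0
    return merged


random.seed(0)
H, W = 4, 5
image = [[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
k3x3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1x1 = [[random.uniform(-1, 1)]]

# Training-time view: three parallel branches whose outputs are summed.
branch3 = conv2d_same(image, k3x3)
branch1 = conv2d_same(image, k1x1)
multi_branch = [
    [branch3[i][j] + branch1[i][j] + image[i][j] for j in range(W)]
    for i in range(H)
]

# Inference-time view: a single 3x3 convolution with the merged kernel.
single = conv2d_same(image, merge_branches(k3x3, k1x1))

max_diff = max(
    abs(multi_branch[i][j] - single[i][j]) for i in range(H) for j in range(W)
)
```

Because convolution is linear, `max_diff` is zero up to floating-point rounding, so under these assumptions the plain 3x3 model reproduces the multi-branch output without any extra inference-time cost.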
1 code implementation • 28 Dec 2020 • Heng Liu, Jianyong Liu, Tao Tao, Shudong Hou, Jungong Han
Due to the limitations of sensors, the transmission medium and the intrinsic properties of ultrasound, the quality of ultrasound imaging is often less than ideal, especially because of its low spatial resolution.
6 code implementations • ICCV 2021 • Xiaohan Ding, Tianxiang Hao, Jianchao Tan, Ji Liu, Jungong Han, Yuchen Guo, Guiguang Ding
Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity.
no code implementations • 17 Jun 2020 • Yunqi Miao, Zijia Lin, Guiguang Ding, Jungong Han
In this paper, we propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images, which diminishes the impact of backgrounds via involving a shallow feature based attention model, and meanwhile, captures multi-scale information via densely connecting hierarchical image features.
1 code implementation • CVPR 2020 • Hui Chen, Guiguang Ding, Xudong Liu, Zijia Lin, Ji Liu, Jungong Han
Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner.
Ranked #18 on Cross-Modal Retrieval on Flickr30k
no code implementations • ECCV 2020 • Yutao Hu, Xiao-Long Jiang, Xuhui Liu, Baochang Zhang, Jungong Han, Xian-Bin Cao, David Doermann
Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts.
no code implementations • 11 Feb 2020 • Xin Wang, Ruisheng Su, Weiyi Xie, Wenjin Wang, Yi Xu, Ritse Mann, Jungong Han, Tao Tan
Such performance gain is more pronounced with transfer learning or in the case of limited training data.
1 code implementation • ECCV 2020 • Liuyu Xiang, Guiguang Ding, Jungong Han
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
Ranked #25 on Long-tail Learning on Places-LT
no code implementations • 24 Oct 2019 • Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo
In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.
4 code implementations • NeurIPS 2019 • Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu
Deep Neural Network (DNN) is powerful but computationally expensive and memory intensive, thus impeding its practical usage on resource-constrained front-end devices.
1 code implementation • CVPR 2020 • Yunlong Yu, Zhong Ji, Zhongfei Zhang, Jungong Han
We introduce a simple yet effective episode-based training framework for zero-shot learning (ZSL), where the learning system is required to recognize unseen classes given only the corresponding class semantics.
5 code implementations • ICCV 2019 • Xiaohan Ding, Yuchen Guo, Guiguang Ding, Jungong Han
We propose Asymmetric Convolution Block (ACB), an architecture-neutral structure as a CNN building block, which uses 1D asymmetric convolutions to strengthen the square convolution kernels.
no code implementations • 2 Jun 2019 • Liuyu Xiang, Xiaoming Jin, Guiguang Ding, Jungong Han, Leida Li
Pedestrian attribute recognition has received increasing attention due to its important role in video surveillance applications.
1 code implementation • 12 May 2019 • Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, Chenggang Yan
It is not easy to design and run Convolutional Neural Networks (CNNs) because: 1) finding the optimal number of filters (i.e., the width) at each layer is tricky, given an architecture; and 2) the computational intensity of CNNs impedes the deployment on computationally limited devices.
no code implementations • ICCV 2019 • Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang
Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.
1 code implementation • CVPR 2019 • Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han
Redundancy is widely recognized in Convolutional Neural Networks (CNNs), which makes it possible to remove unimportant filters from convolutional layers so as to slim the network with an acceptable performance drop.
no code implementations • 27 Jan 2019 • Jiaojiao Zhao, Jungong Han, Ling Shao, Cees G. M. Snoek
We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator.
no code implementations • 30 Nov 2018 • Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann
The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.
no code implementations • 5 Aug 2018 • Jiaojiao Zhao, Li Liu, Cees G. M. Snoek, Jungong Han, Ling Shao
While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from the problems of context confusion and edge color bleeding.
no code implementations • CVPR 2018 • Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu
In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.
1 code implementation • 23 Apr 2018 • Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu
Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.
Ranked #89 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 1 Apr 2018 • Baochang Zhang, Jiaxin Gu, Chen Chen, Jungong Han, Xiangbo Su, Xian-Bin Cao, Jianzhuang Liu
Compression artifacts reduction (CAR) is a challenging problem in the field of remote sensing.
1 code implementation • 1 Apr 2018 • Baochang Zhang, Lian Zhuo, Ze Wang, Jungong Han, Xian-Tong Zhen
Representation learning is a fundamental but challenging problem, especially when the distribution of data is unknown.
no code implementations • 6 Feb 2018 • Zhong Ji, Yuxin Sun, Yunlong Yu, Yanwei Pang, Jungong Han
To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which can perform not only IBIR, but also Text-Based Image Retrieval (TBIR).
no code implementations • 11 Nov 2017 • Baochang Zhang, Shangzhen Luan, Chen Chen, Jungong Han, Wei Wang, Alessandro Perina, Ling Shao
In this paper, we introduce an intermediate step -- solution sampling -- after the data sampling step to form a subspace, in which an optimal solution can be estimated.
1 code implementation • 12 Jul 2017 • Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han
Gesture recognition is a challenging problem in the field of biometrics.
Ranked #1 on Hand Gesture Recognition on MGB
no code implementations • 9 May 2017 • Ce Li, Chen Chen, Baochang Zhang, Qixiang Ye, Jungong Han, Rongrong Ji
Visual data such as videos are often sampled from a complex manifold.
no code implementations • CVPR 2017 • Yang Long, Li Liu, Ling Shao, Fumin Shen, Guiguang Ding, Jungong Han
Using the proposed Unseen Visual Data Synthesis (UVDS) algorithm, semantic attributes are effectively utilised as an intermediate clue to synthesise unseen visual features at the training stage.
no code implementations • 3 May 2017 • Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu
Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.
no code implementations • 12 Feb 2017 • Qiang Zhang, Yi Liu, Rick S. Blum, Jungong Han, DaCheng Tao
As a result of several successful applications in computer vision and image processing, sparse representation (SR) has attracted significant attention in multi-sensor image fusion.
no code implementations • 7 Jun 2016 • Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen
There is a neglected fact in the traditional machine learning methods that the data sampling can actually lead to the solution sampling.