1 code implementation • CVPR 2025 • Leqi Shen, Guoqiang Gong, Tianxiang Hao, Tao He, Yifeng Zhang, Pengzhang Liu, Sicheng Zhao, Jungong Han, Guiguang Ding
The parameter-efficient adaptation of the image-text pretraining model CLIP for video-text retrieval is a prominent area of research.
no code implementations • 7 Jun 2025 • Mingqi Gao, Haoran Duan, Tianlu Zhang, Jungong Han
In this report, we describe our approach to egocentric video object segmentation.
no code implementations • 26 May 2025 • Fengyuan Sun, Leqi Shen, Hui Chen, Sicheng Zhao, Jungong Han, Guiguang Ding
However, these scores exhibit inherent biases: global bias reflects a tendency to focus on the two ends of the visual token sequence, while local bias leads to an over-concentration on the same spatial positions across different frames.
no code implementations • 3 May 2025 • Xingyu Miao, Haoran Duan, Yang Long, Jungong Han
Essentially, UDS refines the gradient terms used in vanilla SDS methods, unifying them to support both tasks.
no code implementations • 15 Apr 2025 • Jingkun Chen, Haoran Duan, Xiao Zhang, Boyan Gao, Tao Tan, Vicente Grau, Jungong Han
To implement this, the teacher model first learns from gaze points enhanced by VLM-generated descriptions of lesion morphology, establishing a foundation for guiding the student model.
1 code implementation • 3 Apr 2025 • Yuan Zhou, Shilong Jin, Litao Hua, Wanjun Lv, Haoran Duan, Jungong Han
Recent advances in zero-shot text-to-3D generation have revolutionized 3D content creation by enabling direct synthesis from textual descriptions.
2 code implementations • CVPR 2025 • Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Vision network designs, including Convolutional Neural Networks and Vision Transformers, have significantly advanced the field of computer vision.
no code implementations • 28 Mar 2025 • Yang Liu, Feixiang Liu, Jiale Du, Xinbo Gao, Jungong Han
Our UMMEC method significantly improves classification performance with minimal labeled data, advancing the state-of-the-art in TFSL.
no code implementations • 28 Mar 2025 • Yang Liu, Xun Zhang, Jiale Du, Xinbo Gao, Jungong Han
Zero-shot Learning (ZSL) attains knowledge transfer from seen classes to unseen classes by exploring auxiliary category information, which is a promising yet difficult research topic.
2 code implementations • 10 Mar 2025 • Ao Wang, Lihao Liu, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Object detection and segmentation are widely employed in computer vision applications, yet conventional models like YOLO series, while efficient and accurate, are limited by predefined categories, hindering adaptability in open scenarios.
no code implementations • 18 Feb 2025 • Leiyu Pan, Zhenpeng Su, Minxuan Lv, Yizhe Xiong, Xiangwen Zhang, Zijia Lin, Hui Chen, Jungong Han, Guiguang Ding, Cheng Luo, Di Zhang, Kun Gai, Deyi Xiong
Moreover, we find that Finedeep achieves optimal results when balancing depth and width, specifically by adjusting the number of expert sub-layers and the number of experts per sub-layer.
no code implementations • 18 Feb 2025 • Minxuan Lv, Zhenpeng Su, Leiyu Pan, Yizhe Xiong, Zijia Lin, Hui Chen, Wei Zhou, Jungong Han, Guiguang Ding, Cheng Luo, Di Zhang, Kun Gai, Songlin Hu
As large language models continue to scale, computational costs and resource consumption have emerged as significant challenges.
no code implementations • 30 Jan 2025 • Haohan Shi, Fei Zhou, Xin Sun, Jungong Han
Furthermore, we demonstrate, for the first time, that the low-rank property of the learnable upsampling layer is a key bottleneck in lightweight SHSR methods.
Hyperspectral Image Super-Resolution
Image Super-Resolution
no code implementations • 29 Jan 2025 • Jingkun Chen, Guang Yang, Xiao Zhang, Jingchao Peng, Tianlu Zhang, JianGuo Zhang, Jungong Han, Vicente Grau
Detecting novel anomalies in medical imaging is challenging due to the limited availability of labeled data for rare abnormalities, which often display high variability and subtlety.
1 code implementation • 18 Jan 2025 • Shanwen Wang, Changrui Chen, Xin Sun, Danfeng Hong, Jungong Han
To address these problems, this paper proposes a novel semi-supervised Multi-Scale Uncertainty and Cross-Teacher-Student Attention (MUCA) model for RS image semantic segmentation tasks.
1 code implementation • 30 Dec 2024 • Lihao Liu, Juexiao Feng, Hui Chen, Ao Wang, Lin Song, Jungong Han, Guiguang Ding
In this work, we introduce Universal Open-World Object Detection (Uni-OWD), a new paradigm that unifies open-vocabulary and open-world object detection tasks.
1 code implementation • 22 Dec 2024 • Yi Liu, Chengxin Li, Xiaohui Dong, Lei LI, Dingwen Zhang, Shoukun Xu, Jungong Han
To this end, inspired by the agreeable nature of binary segmentation for SOD and COD, we propose a Contrastive Distillation Paradigm (CDP) to distil the foreground from the background, facilitating the identification of salient and camouflaged objects amidst their surroundings.
1 code implementation • 8 Dec 2024 • Ao Wang, Fengyuan Sun, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
In this paper, we demonstrate that in MLLMs, the [CLS] token in the visual encoder inherently knows which visual tokens are important for MLLMs.
1 code implementation • 4 Dec 2024 • Ao Wang, Hui Chen, Jianchao Tan, Kefeng Zhang, Xunliang Cai, Zijia Lin, Jungong Han, Guiguang Ding
With an adaptive layer-wise KV retention recipe based on binary search, the maximum contextual information can thus be preserved in each layer, facilitating the generation.
no code implementations • 26 Nov 2024 • Hui-Yue Yang, Hui Chen, Ao Wang, Kai Chen, Zijia Lin, Yongliang Tang, Pengcheng Gao, Yuming Quan, Jungong Han, Guiguang Ding
Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability.
no code implementations • 11 Nov 2024 • Hongsheng Zhang, Zhong Ji, Jingren Liu, Yanwei Pang, Jungong Han
One is that the popularly adopted single-teacher paradigm fails to impart comprehensive knowledge. The other is that existing methods inadequately leverage the multimodal information in the original training dataset; instead, they rely on additional data for distillation, which increases computational and storage overhead.
no code implementations • 29 Oct 2024 • Zhong Ji, Shuo Yang, Jingren Liu, Yanwei Pang, Jungong Han
Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data.
1 code implementation • 19 Oct 2024 • Yi Liu, Chengxin Li, Shoukun Xu, Jungong Han
To tackle the challenge, in this paper, we propose a Part-Whole Relational Fusion (PWRF) framework.
1 code implementation • 10 Sep 2024 • Hui-Yue Yang, Hui Chen, Lihao Liu, Zijia Lin, Kai Chen, Liejun Wang, Jungong Han, Guiguang Ding
By incorporating the RASFormer block, our RAS method achieves superior contextual awareness capabilities, leading to remarkable performance.
Multi-class Anomaly Detection
Unsupervised Anomaly Detection
no code implementations • 14 Aug 2024 • Fan Yang, Sicheng Zhao, Yanhao Zhang, Haoxiang Chen, Hui Chen, Wenbo Tang, Haonan Lu, Pengfei Xu, Zhenyu Yang, Jungong Han, Guiguang Ding
Recent advancements in autonomous driving, augmented reality, robotics, and embodied intelligence have necessitated 3D perception algorithms.
no code implementations • 26 Jul 2024 • Mengyao Lyu, Tianxiang Hao, Xinhao Xu, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
In response, we present Learn from the Learnt (LFTL), a novel paradigm for SFADA that leverages the learnt knowledge from the source pretrained model and actively iterated models without extra overhead.
no code implementations • 24 Jul 2024 • Jingren Liu, Zhong Ji, Yunlong Yu, Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li
This work provides a theoretical foundation for understanding and improving PEFT-CL models, offering insights into the interplay between feature representation, task orthogonality, and generalization, contributing to the development of more efficient continual learning systems.
2 code implementations • 24 Jun 2024 • Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, YaoWei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng, Bin Cao, Yisi Zhang, Xuanxu Lin, Xingjian He, Bo Zhao, Jing Liu, Feiyu Pan, Hao Fang, Xiankai Lu
Moreover, we provide a new motion expression guided video segmentation dataset MeViS to study the natural language-guided video understanding in complex environments.
1 code implementation • 11 Jun 2024 • Mingqi Gao, Jingnan Luo, Jinyu Yang, Jungong Han, Feng Zheng
Motion Expression guided Video Segmentation (MeViS), as an emerging task, poses many new challenges to the field of referring video object segmentation (RVOS).
3 code implementations • 23 May 2024 • Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding
In this work, we aim to further advance the performance-efficiency boundary of YOLOs from both the post-processing and model architecture.
Ranked #23 on Real-Time Object Detection on MS COCO
no code implementations • 6 May 2024 • Nianchang Huang, Yang Yang, Qiang Zhang, Jungong Han, Jin Huang
A novel modality-adaptive Transformer (MAT) will be proposed to investigate two fundamental challenges of AM SOD, i.e., more diverse modality discrepancies caused by the varying modality types that need to be processed, and dynamic fusion design caused by an uncertain number of modalities present in the inputs of the multimodal fusion strategy.
1 code implementation • 6 May 2024 • Nianchang Huang, Yang Yang, Ruida Xi, Qiang Zhang, Jungong Han, Jin Huang
The most prominent characteristics of AM SOD are that the modality types and modality numbers will be arbitrary or dynamically changed.
no code implementations • 27 Apr 2024 • Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Wei Huang, Jianwei Niu, Jungong Han, Guiguang Ding
In this paper, we propose the novel concept of Temporal Scaling Law, studying how the test loss of an LLM evolves as the training steps scale up.
no code implementations • 27 Apr 2024 • Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Hui Chen, Peng Liu, Jungong Han, Guiguang Ding
Since BPE iteratively merges the most frequent token pair in the text corpus to generate a new token and keeps all generated tokens in the vocabulary, it unavoidably holds tokens that primarily act as components of a longer token and appear infrequently on their own.
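The merge behaviour described above can be illustrated with a minimal sketch of textbook BPE training. This is not the paper's proposed method; the function names (`train_bpe`, `standalone_count`) and the toy corpus are assumptions chosen for illustration. It shows how an intermediate token can stay in the vocabulary while never appearing on its own in the segmented corpus.

```python
from collections import Counter


def most_frequent_pair(words):
    """Return the most frequent adjacent symbol pair, or None if no pair exists."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every non-overlapping occurrence of `pair` with the merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def train_bpe(corpus, num_merges):
    """Run `num_merges` BPE merges; every generated token is kept in the vocabulary."""
    words = {tuple(w): c for w, c in Counter(corpus.split()).items()}
    vocab = sorted({s for w in words for s in w})
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        vocab.append(pair[0] + pair[1])
    return vocab, words


def standalone_count(words, token):
    """Count how often `token` appears as its own symbol in the segmented corpus."""
    return sum(freq * symbols.count(token) for symbols, freq in words.items())


# Hypothetical toy corpus: "aa" is created first, then absorbed into "aaaa".
vocab, words = train_bpe("aaaa aaaa aaaa", num_merges=2)
print(vocab)                          # "aa" remains in the vocabulary
print(standalone_count(words, "aa"))  # but never occurs on its own
```

In this toy run the intermediate token `aa` is only ever a component of `aaaa`, which is the kind of vocabulary entry the abstract refers to.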
1 code implementation • 24 Apr 2024 • Zhong Ji, Yimu Su, Yan Zhang, Jiacheng Hou, Yanwei Pang, Jungong Han
Video Wire Inpainting (VWI) is a prominent application in video inpainting, aimed at flawlessly removing wires in films or TV series, offering significant time and labor savings compared to manual frame-by-frame removal.
1 code implementation • 6 Apr 2024 • Zhuoxu Huang, Zhenkun Fan, Tao Xu, Jungong Han
We introduce Motion PointNet composed of a PointNet-like encoder and a PDE-solving module.
no code implementations • CVPR 2024 • Yunqi Miao, Jiankang Deng, Jungong Han
Although diffusion models are rising as a powerful solution for blind face restoration, they are criticized for two problems: 1) slow training and inference speed, and 2) failure in preserving identity and recovering fine-grained facial details.
1 code implementation • 14 Mar 2024 • Yizhe Xiong, Hui Chen, Tianxiang Hao, Zijia Lin, Jungong Han, Yuesong Zhang, Guoxin Wang, Yongjun Bao, Guiguang Ding
Consequently, a simple combination of them cannot guarantee accomplishing both training efficiency and inference efficiency with minimal costs.
2 code implementations • 13 Feb 2024 • Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed
To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.
no code implementations • CVPR 2024 • Mengyao Lyu, Yuhong Yang, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, Guiguang Ding
The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors.
1 code implementation • 26 Dec 2023 • Mengyao Lyu, Yuhong Yang, Haiwen Hong, Hui Chen, Xuan Jin, Yuan He, Hui Xue, Jungong Han, Guiguang Ding
The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors.
no code implementations • 17 Dec 2023 • Tianxiang Hao, Mengyao Lyu, Hui Chen, Sicheng Zhao, Xiaohan Ding, Jungong Han, Guiguang Ding
To better understand the nature of prompt tuning, we propose the concept of "Information Density" (ID) to indicate whether a matrix strongly belongs to certain feature spaces rather than being evenly distributed across various feature spaces.
2 code implementations • 10 Dec 2023 • Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Here, to achieve real-time segment-anything inference on mobile devices, following MobileSAM, we replace the heavyweight image encoder in SAM with the RepViT model, ending up with the RepViT-SAM model.
1 code implementation • 2 Dec 2023 • Changrui Chen, Jungong Han, Kurt Debattista
Due to the costliness of labelled data in real-world applications, semi-supervised learning, underpinned by pseudo labelling, is an appealing solution.
no code implementations • 27 Sep 2023 • Ao Wang, Hui Chen, Zijia Lin, Sicheng Zhao, Jungong Han, Guiguang Ding
We further employ a consistent dynamic channel pruning (CDCP) strategy to dynamically prune unimportant channels in ViTs.
1 code implementation • ICCV 2023 • Yutao Hu, Qixiong Wang, Wenqi Shao, Enze Xie, Zhenguo Li, Jungong Han, Ping Luo
In this paper, we address this issue from two perspectives.
8 code implementations • CVPR 2024 • Ao Wang, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
Recently, lightweight Vision Transformers (ViTs) demonstrate superior performance and lower latency, compared with lightweight Convolutional Neural Networks (CNNs), on resource-constrained mobile devices.
no code implementations • 17 Jul 2023 • Hao Chen, Yonghan Dong, Zheming Lu, Yunlong Yu, Yingming Li, Jungong Han, Zhongfei Zhang
Few-Shot Segmentation (FSS) aims to segment the novel class images with a few annotated samples.
1 code implementation • 1 Jul 2023 • Shaohui Lin, Wenxuan Huang, Jiao Xie, Baochang Zhang, Yunhang Shen, Zhou Yu, Jungong Han, David Doermann
In this paper, we propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning, which globally prunes the redundant filters based on the prior knowledge of a pre-trained model in a differential and non-alternative optimization.
no code implementations • 8 May 2023 • Yi Liu, Shoukun Xu, Dingwen Zhang, Jungong Han
Co-salient object detection aims to detect co-existing salient objects among a group of images.
1 code implementation • CVPR 2023 • Zixuan Ding, Ao Wang, Hui Chen, Qiang Zhang, Pengzhang Liu, Yongjun Bao, Weipeng Yan, Jungong Han
In this paper, we advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior about the label-to-label correspondence via a semantic prior prompter.
no code implementations • CVPR 2023 • Tianlu Zhang, Hongyuan Guo, Qiang Jiao, Qiang Zhang, Jungong Han
Most current RGB-T trackers adopt a two-stream structure to extract unimodal RGB and thermal features and complex fusion strategies to achieve multi-modal feature fusion, which require a huge number of parameters, thus hindering their real-life applications.
Ranked #12 on RGB-T Tracking on GTOT
no code implementations • 22 Nov 2022 • Yunqi Miao, Jiankang Deng, Guiguang Ding, Jungong Han
Since samples with high confidence are exclusively involved in the formation of centroids, the identity information of low-confidence samples, i.e., boundary samples, is NOT likely to contribute to the corresponding centroid.
1 code implementation • 11 Nov 2022 • Yunqi Miao, Alexandros Lattas, Jiankang Deng, Jungong Han, Stefanos Zafeiriou
Specifically, we reconstruct 3D face shape and reflectance from a large 2D facial dataset and introduce a novel method of transforming the VIS reflectance to NIR reflectance.
no code implementations • 3 Nov 2022 • Fan Yang, Xinhao Xu, Hui Chen, Yuchen Guo, Jungong Han, Kai Ni, Guiguang Ding
To pick up the ground plane prior for M3OD, we propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.
1 code implementation • 23 Oct 2022 • Zhuoxu Huang, Zhiyou Zhao, Banghuai Li, Jungong Han
The Transformer, with its underlying attention mechanism and its ability to capture long-range dependencies, is a natural choice for unordered point cloud data.
Ranked #1 on 3D Semantic Segmentation on SensatUrban
no code implementations • 1 Sep 2022 • Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei
In DestFormer, the spatial and temporal dimensions of the 4D point cloud videos are decoupled to achieve efficient self-attention for learning both long-term and short-term features.
no code implementations • 8 Aug 2022 • Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding
Video-text retrieval (VTR) is an attractive yet challenging task for multi-modal understanding, which aims to search for relevant video (text) given a query (video).
no code implementations • 21 Jul 2022 • Boyang xia, Zhihao Wang, Wenhao Wu, Haoran Wang, Jungong Han
For each category, its common pattern is employed as a query, and the most salient frames respond to it.
Ranked #5 on Action Recognition on ActivityNet
1 code implementation • 7 Jul 2022 • Changrui Chen, Kurt Debattista, Jungong Han
Due to the costliness of labelled data in real-world applications, semi-supervised object detectors, underpinned by pseudo labelling, are appealing.
1 code implementation • 30 May 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding
For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models.
no code implementations • 29 Mar 2022 • De Cheng, Gerong Wang, Bo wang, Qiang Zhang, Jungong Han, Dingwen Zhang
This design makes the presented transformer model a hybrid of 1) top-down and bottom-up attention pathways and 2) dynamic and static routing pathways.
8 code implementations • CVPR 2022 • Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, Jian Sun
We revisit large kernel design in modern convolutional neural networks (CNNs).
Ranked #68 on Image Classification on ImageNet
1 code implementation • 11 Jan 2022 • Yunqi Miao, Nianchang Huang, Xiao Ma, Qiang Zhang, Jungong Han
Visible-infrared person re-identification (VI-ReID) has been challenging due to the existence of large discrepancies between visible and infrared modalities.
no code implementations • 7 Jan 2022 • Dingwen Zhang, Guohai Huang, Qiang Zhang, Jungong Han, Junwei Han, Yizhou Yu
Recent advances in machine learning and prevalence of digital medical images have opened up an opportunity to address the challenging brain tumor segmentation (BTS) task by using deep convolutional neural networks.
no code implementations • CVPR 2022 • Qiang Zhang, Changzhou Lai, Jianan Liu, Nianchang Huang, Jungong Han
Then, a feature-level modality compensation module is presented to generate those missing modality-specific features from existing modality-shared ones.
4 code implementations • CVPR 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding
Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
Ranked #70 on Semantic Segmentation on Cityscapes val
1 code implementation • 19 Sep 2021 • Zerun Wang, Liuyu Xiang, Fan Yang, Jinzhao Qian, Jie Hu, Haidong Huang, Jungong Han, Yuchen Guo, Guiguang Ding
While recent deep deblurring algorithms have achieved remarkable progress, most existing methods focus on the global deblurring problem, where the image blur mostly arises from severe camera shake.
no code implementations • 3 Sep 2021 • Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han
Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains.
2 code implementations • 30 Jul 2021 • Xiaohan Ding, Tianxiang Hao, Jungong Han, Yuchen Guo, Guiguang Ding
The existence of redundancy in Convolutional Neural Networks (CNNs) enables us to remove some filters/channels with acceptable performance drops.
no code implementations • CVPR 2021 • Qiang Zhang, Shenlu Zhao, Yongjiang Luo, Dingwen Zhang, Nianchang Huang, Jungong Han
Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images.
Ranked #34 on Thermal Image Segmentation on MFN Dataset
10 code implementations • 5 May 2021 • Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.
Ranked #818 on Image Classification on ImageNet
no code implementations • 23 Apr 2021 • Nianchang Huang, Qiang Zhang, Jungong Han
The former one first uses two sub-networks to extract unimodal features from RGB and depth images, respectively, and then fuses them for SOD.
no code implementations • 23 Apr 2021 • Nianchang Huang, Jianan Liu, Qiang Zhang, Jungong Han
Most existing cross-modality person re-identification works rely on discriminative modality-shared features for reducing cross-modality variations and intra-modality variations.
Cross-Modality Person Re-identification
Person Re-Identification
1 code implementation • NeurIPS 2020 • Dingwen Zhang, HaiBin Tian, Jungong Han
A fundamental challenge in training the existing deep saliency detection models is the requirement of large amounts of annotated data.
1 code implementation • 29 Mar 2021 • Dingwen Zhang, Bo wang, Gerong Wang, Qiang Zhang, Jiajia Zhang, Jungong Han, Zheng You
Onfocus detection aims at identifying whether the focus of the individual captured by a camera is on the camera or not.
3 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs.
1 code implementation • 18 Feb 2021 • Chaowei Fang, HaiBin Tian, Dingwen Zhang, Qiang Zhang, Jungong Han, Junwei Han
To this end, this paper revisits the role of top-down modeling in salient object detection and designs a novel densely nested top-down flows (DNTDF)-based framework.
25 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.
Ranked #49 on Semantic Segmentation on Cityscapes val
1 code implementation • 28 Dec 2020 • Heng Liu, Jianyong Liu, Tao Tao, Shudong Hou, Jungong Han
Due to the limitations of sensors, the transmission medium and the intrinsic properties of ultrasound, the quality of ultrasound imaging is always less than ideal, especially with respect to its low spatial resolution.
6 code implementations • ICCV 2021 • Xiaohan Ding, Tianxiang Hao, Jianchao Tan, Ji Liu, Jungong Han, Yuchen Guo, Guiguang Ding
Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity.
no code implementations • 17 Jun 2020 • Yunqi Miao, Zijia Lin, Guiguang Ding, Jungong Han
In this paper, we propose a Shallow feature based Dense Attention Network (SDANet) for crowd counting from still images, which diminishes the impact of backgrounds via involving a shallow feature based attention model, and meanwhile, captures multi-scale information via densely connecting hierarchical image features.
1 code implementation • CVPR 2020 • Hui Chen, Guiguang Ding, Xudong Liu, Zijia Lin, Ji Liu, Jungong Han
Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner.
Ranked #20 on Cross-Modal Retrieval on Flickr30k
no code implementations • ECCV 2020 • Yutao Hu, Xiao-Long Jiang, Xuhui Liu, Baochang Zhang, Jungong Han, Xian-Bin Cao, David Doermann
Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts.
1 code implementation • 11 Feb 2020 • Xin Wang, Ruisheng Su, Weiyi Xie, Wenjin Wang, Yi Xu, Ritse Mann, Jungong Han, Tao Tan
Such performance gain is more pronounced with transfer learning or in the case of limited training data.
1 code implementation • ECCV 2020 • Liuyu Xiang, Guiguang Ding, Jungong Han
We refer to these models as 'Experts', and the proposed LFME framework aggregates the knowledge from multiple 'Experts' to learn a unified student model.
Ranked #27 on Long-tail Learning on Places-LT
no code implementations • 24 Oct 2019 • Chunlei Liu, Wenrui Ding, Jinyu Yang, Vittorio Murino, Baochang Zhang, Jungong Han, Guodong Guo
In this paper, we propose a novel aggregation signature suitable for small object tracking, especially aiming for the challenge of sudden and large drift.
4 code implementations • NeurIPS 2019 • Xiaohan Ding, Guiguang Ding, Xiangxin Zhou, Yuchen Guo, Jungong Han, Ji Liu
Deep Neural Network (DNN) is powerful but computationally expensive and memory intensive, thus impeding its practical usage on resource-constrained front-end devices.
1 code implementation • CVPR 2020 • Yunlong Yu, Zhong Ji, Zhongfei Zhang, Jungong Han
We introduce a simple yet effective episode-based training framework for zero-shot learning (ZSL), where the learning system is required to recognize unseen classes given only the corresponding class semantics.
5 code implementations • ICCV 2019 • Xiaohan Ding, Yuchen Guo, Guiguang Ding, Jungong Han
We propose Asymmetric Convolution Block (ACB), an architecture-neutral structure as a CNN building block, which uses 1D asymmetric convolutions to strengthen the square convolution kernels.
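One way to see how 1D asymmetric kernels can strengthen a square kernel is the additivity of convolution. The sketch below assumes, beyond what the snippet above states, that the 3x3, 1x3 and 3x1 branches are applied in parallel with aligned centres and that their outputs are summed; batch normalisation and multiple channels are ignored. The helper names are hypothetical. Under these assumptions, the three branches are equivalent to a single 3x3 convolution whose central row and column have been reinforced.

```python
import random


def conv2d_same(x, k):
    """Cross-correlate a 2D list `x` with an odd-sized kernel `k`, zero padded to keep the shape."""
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < h and 0 <= c < w:  # outside the image counts as zero
                        acc += x[r][c] * k[a][b]
            out[i][j] = acc
    return out


def fuse_kernels(square, horizontal, vertical):
    """Add a 1x3 kernel onto the middle row and a 3x1 kernel onto the middle column of a 3x3 kernel."""
    fused = [row[:] for row in square]
    for b in range(3):
        fused[1][b] += horizontal[0][b]
    for a in range(3):
        fused[a][1] += vertical[a][0]
    return fused


def branch_sum(x, square, horizontal, vertical):
    """Sum the outputs of the three parallel branches (assumed combination rule)."""
    outs = [conv2d_same(x, k) for k in (square, horizontal, vertical)]
    return [[sum(o[i][j] for o in outs) for j in range(len(x[0]))]
            for i in range(len(x))]


# Random integer inputs keep the arithmetic exact for the comparison below.
random.seed(0)
x = [[random.randint(-3, 3) for _ in range(6)] for _ in range(6)]
square = [[random.randint(-2, 2) for _ in range(3)] for _ in range(3)]
horizontal = [[random.randint(-2, 2) for _ in range(3)]]    # shape 1x3
vertical = [[random.randint(-2, 2)] for _ in range(3)]      # shape 3x1

three_branches = branch_sum(x, square, horizontal, vertical)
single_kernel = conv2d_same(x, fuse_kernels(square, horizontal, vertical))
max_diff = max(abs(p - q) for rp, rq in zip(three_branches, single_kernel)
               for p, q in zip(rp, rq))
print(max_diff)
```

Because the fused kernel reproduces the summed branches, the extra 1D kernels add no cost once they are folded into the square kernel.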
no code implementations • 2 Jun 2019 • Liuyu Xiang, Xiaoming Jin, Guiguang Ding, Jungong Han, Leida Li
Pedestrian attribute recognition has received increasing attention due to its important role in video surveillance applications.
1 code implementation • 12 May 2019 • Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han, Chenggang Yan
It is not easy to design and run Convolutional Neural Networks (CNNs) for two reasons: 1) finding the optimal number of filters (i.e., the width) at each layer is tricky, given an architecture; and 2) the computational intensity of CNNs impedes their deployment on computationally limited devices.
no code implementations • ICCV 2019 • Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang
Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.
1 code implementation • CVPR 2019 • Xiaohan Ding, Guiguang Ding, Yuchen Guo, Jungong Han
Redundancy is widely recognized in Convolutional Neural Networks (CNNs), which makes it possible to remove unimportant filters from convolutional layers so as to slim the network with an acceptable performance drop.
no code implementations • 27 Jan 2019 • Jiaojiao Zhao, Jungong Han, Ling Shao, Cees G. M. Snoek
We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator.
no code implementations • 30 Nov 2018 • Jiaxin Gu, Ce Li, Baochang Zhang, Jungong Han, Xian-Bin Cao, Jianzhuang Liu, David Doermann
The advancement of deep convolutional neural networks (DCNNs) has driven significant improvement in the accuracy of recognition systems for many computer vision tasks.
no code implementations • 5 Aug 2018 • Jiaojiao Zhao, Li Liu, Cees G. M. Snoek, Jungong Han, Ling Shao
While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from the problems of context confusion and edge color bleeding.
no code implementations • CVPR 2018 • Xiaodi Wang, Baochang Zhang, Ce Li, Rongrong Ji, Jungong Han, Xian-Bin Cao, Jianzhuang Liu
In this paper, we propose new Modulated Convolutional Networks (MCNs) to improve the portability of CNNs via binarized filters.
1 code implementation • 23 Apr 2018 • Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Changqing Zou, Jianzhuang Liu
Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence.
Ranked #104 on Skeleton Based Action Recognition on NTU RGB+D
no code implementations • 1 Apr 2018 • Baochang Zhang, Jiaxin Gu, Chen Chen, Jungong Han, Xiangbo Su, Xian-Bin Cao, Jianzhuang Liu
Compression artifacts reduction (CAR) is a challenging problem in the field of remote sensing.
1 code implementation • 1 Apr 2018 • Baochang Zhang, Lian Zhuo, Ze Wang, Jungong Han, Xian-Tong Zhen
Representation learning is a fundamental but challenging problem, especially when the distribution of data is unknown.
no code implementations • 6 Feb 2018 • Zhong Ji, Yuxin Sun, Yunlong Yu, Yanwei Pang, Jungong Han
To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which can perform not only IBIR, but also Text-Based Image Retrieval (TBIR).
no code implementations • 11 Nov 2017 • Baochang Zhang, Shangzhen Luan, Chen Chen, Jungong Han, Wei Wang, Alessandro Perina, Ling Shao
In this paper, we introduce an intermediate step -- solution sampling -- after the data sampling step to form a subspace, in which an optimal solution can be estimated.
1 code implementation • 12 Jul 2017 • Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han
Gesture recognition is a challenging problem in the field of biometrics.
Ranked #1 on Hand Gesture Recognition on MGB
no code implementations • 9 May 2017 • Ce Li, Chen Chen, Baochang Zhang, Qixiang Ye, Jungong Han, Rongrong Ji
Visual data such as videos are often sampled from a complex manifold.
no code implementations • CVPR 2017 • Yang Long, Li Liu, Ling Shao, Fumin Shen, Guiguang Ding, Jungong Han
Using the proposed Unseen Visual Data Synthesis (UVDS) algorithm, semantic attributes are effectively utilised as an intermediate clue to synthesise unseen visual features at the training stage.
no code implementations • 3 May 2017 • Shangzhen Luan, Baochang Zhang, Chen Chen, Xian-Bin Cao, Jungong Han, Jianzhuang Liu
Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations.
no code implementations • 12 Feb 2017 • Qiang Zhang, Yi Liu, Rick S. Blum, Jungong Han, DaCheng Tao
As a result of several successful applications in computer vision and image processing, sparse representation (SR) has attracted significant attention in multi-sensor image fusion.
no code implementations • 7 Jun 2016 • Shangzhen Luan, Baochang Zhang, Jungong Han, Chen Chen, Ling Shao, Alessandro Perina, Linlin Shen
There is a neglected fact in the traditional machine learning methods that the data sampling can actually lead to the solution sampling.