no code implementations • 12 Dec 2024 • Junjie Zhou, Ke Zhu, Jianxin Wu
Knowledge Distillation (KD) is essential for transferring dark knowledge from a large teacher to a small student network, so that the student can be far more efficient than the teacher while achieving comparable accuracy.
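As a point of reference, the classic KD objective blends a temperature-softened KL term against the teacher with the usual cross-entropy on hard labels. The PyTorch sketch below illustrates that general recipe, not this paper's specific method; the temperature T and mixing weight alpha are hypothetical hyperparameters.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic KD objective (Hinton et al.): KL between temperature-softened
    teacher and student distributions, blended with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # the T^2 factor keeps gradient magnitudes comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```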
no code implementations • 21 Nov 2024 • Minghao Fu, Hao Yu, Jie Shao, Junjie Zhou, Ke Zhu, Jianxin Wu
Deep neural networks, while achieving remarkable success across diverse tasks, demand significant resources, including computation, GPU memory, bandwidth, storage, and energy.
no code implementations • 19 Nov 2024 • Jie Shao, Hanxiao Zhang, Jianxin Wu
In this work, we explore the quantization of diffusion models in extreme compression regimes to reduce model size while maintaining performance.
no code implementations • 25 Jun 2024 • Ningyuan Tang, Minghao Fu, Jianxin Wu
The rapid scaling of large pretrained vision models makes fine-tuning increasingly difficult on edge devices with limited computational resources.
no code implementations • 11 Jun 2024 • Hao Yu, Zelan Yang, Shen Li, Yong Li, Jianxin Wu
The advent of pre-trained large language models (LLMs) has revolutionized various natural language processing tasks.
1 code implementation • 28 May 2024 • Hao Yu, Minghao Fu, Jiandong Ding, Yusheng Zhou, Jianxin Wu
To address these challenges, we propose a unified low-rank decomposition framework for compressing CTR prediction models.
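For context, the basic building block of low-rank compression is factorizing one weight matrix via truncated SVD. The sketch below shows that step for a single linear layer, under the assumption of a hypothetical `rank` knob; the paper's unified framework for CTR models is more involved.

```python
import torch

def low_rank_factorize(linear: torch.nn.Linear, rank: int):
    """Replace an (out, in) weight with two smaller factors via truncated SVD.
    Parameter count drops from out*in to rank*(out+in)."""
    W = linear.weight.data                       # shape (out, in)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]                   # (out, rank)
    B = Vh[:rank, :]                             # (rank, in); A @ B approximates W
    first = torch.nn.Linear(W.shape[1], rank, bias=False)
    second = torch.nn.Linear(rank, W.shape[0], bias=linear.bias is not None)
    first.weight.data.copy_(B)
    second.weight.data.copy_(A)
    if linear.bias is not None:
        second.bias.data.copy_(linear.bias.data)
    return torch.nn.Sequential(first, second)   # drop-in replacement for `linear`
```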
no code implementations • 30 Apr 2024 • Yun-Hao Cao, Jianxin Wu
In this paper, we propose an efficient single-branch SSL method based on non-parametric instance discrimination, aiming to improve the algorithm, model, and data efficiency of SSL.
no code implementations • CVPR 2024 • Hanxiao Zhang, Yifan Zhou, Guo-Hua Wang, Jianxin Wu
In particular, traditional CNN few-shot compression methods suffer from sparse compression: they can produce only very few compressed models of different sizes.
no code implementations • 8 Mar 2024 • Jie Shao, Ke Zhu, Hanxiao Zhang, Jianxin Wu
This paper proposes a new pipeline for long-tail (LT) recognition.
no code implementations • 6 Feb 2024 • Ningyuan Tang, Minghao Fu, Ke Zhu, Jianxin Wu
Because learnable parameters from these methods are entangled with the pretrained model, gradients related to the frozen pretrained model's parameters have to be computed and stored during finetuning.
2 code implementations • 30 Jan 2024 • Hao Yu, Yingxiao Du, Jianxin Wu
In this paper, we aim to enhance the accuracy of the worst-performing categories and utilize the harmonic mean and geometric mean to assess the model's performance.
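Both summary statistics are easy to compute from per-class accuracies; unlike the arithmetic mean, they are dominated by the worst-performing categories. A minimal NumPy sketch (the epsilon guard for zero-accuracy classes is a hypothetical choice):

```python
import numpy as np

def summary_means(per_class_acc):
    """Harmonic and geometric means over per-class accuracies; both
    expose tail-class failures that the arithmetic mean hides."""
    acc = np.asarray(per_class_acc, dtype=np.float64)
    eps = 1e-12                                   # guard against zero accuracies
    hmean = len(acc) / np.sum(1.0 / (acc + eps))
    gmean = np.exp(np.mean(np.log(acc + eps)))
    return hmean, gmean
```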
no code implementations • 29 Jan 2024 • Ke Zhu, Minghao Fu, Jie Shao, Tianyu Liu, Jianxin Wu
While existing methods fail to handle the regression bias, this paper hypothesizes that the class-specific regression head for rare classes is its main cause.
1 code implementation • 13 Dec 2023 • Minghao Fu, Ke Zhu, Jianxin Wu
As pre-trained models rapidly grow larger, the cost of fine-tuning on downstream tasks steadily increases as well.
no code implementations • ICCV 2023 • Ke Zhu, Minghao Fu, Jianxin Wu
Self-supervised learning (SSL) methods targeting scene images have seen a rapid growth recently, and they mostly rely on either a dedicated dense matching mechanism or a costly unsupervised object discovery module.
no code implementations • 20 Jul 2023 • Ke Zhu, Yin-Yin He, Jianxin Wu
QFD first trains a quantized (or binarized) representation as the teacher, then quantizes the network using knowledge distillation (KD).
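For background, quantization-aware training typically relies on a fake-quantize operator with a straight-through gradient. The sketch below shows that standard building block, not QFD's specific teacher-student procedure; the bit width is a hypothetical parameter.

```python
import torch

def fake_quantize(x, num_bits=4):
    """Uniform symmetric quantization with a straight-through estimator:
    round in the forward pass, pass gradients through unchanged."""
    qmax = 2 ** (num_bits - 1) - 1                # symmetric signed range
    scale = x.detach().abs().max() / qmax + 1e-12
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()                   # forward: q; backward: identity
```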
no code implementations • 7 Jun 2023 • Ke Zhu, Yin-Yin He, Jianxin Wu
That is, coarse crops benefit SSL on scene images.
no code implementations • CVPR 2024 • Minghao Fu, Ke Zhu, Jianxin Wu
With both the new pFSL setting and novel IbM2 method, this paper shows that practical few-shot learning is both viable and promising.
no code implementations • CVPR 2023 • Yingxiao Du, Jianxin Wu
The convention in long-tailed recognition is to manually split all categories into three subsets and report the average accuracy within each subset.
1 code implementation • 2 Mar 2023 • Guo-Hua Wang, Jianxin Wu
At a 22% latency reduction, it surpasses previous methods by an average of 7 percentage points on ImageNet-1k.
1 code implementation • 12 Jul 2022 • Yun-Hao Cao, Peiqin Sun, Yechang Huang, Jianxin Wu, Shuchang Zhou
In this paper, we propose a method called synergistic self-supervised and quantization learning (SSQL) to pretrain quantization-friendly self-supervised models, facilitating downstream deployment.
1 code implementation • 13 Mar 2022 • Minghao Fu, Yun-Hao Cao, Jianxin Wu
Few-shot recognition learns a recognition model with very few (e.g., 1 or 5) images per category, and current few-shot learning methods focus on improving the average accuracy over many episodes.
no code implementations • 18 Feb 2022 • Guo-Hua Wang, Jianxin Wu
However, they lack theoretical support and cannot explain why predictions are good candidates for pseudo-labels in the deep learning paradigm.
no code implementations • 17 Feb 2022 • Kun Yi, Guo-Hua Wang, Jianxin Wu
It is easy to collect a dataset with noisy labels, but such noise makes networks overfit severely and causes accuracy to drop dramatically.
1 code implementation • CVPR 2023 • Guo-Hua Wang, Jianxin Wu
Previous methods mainly adopt filter-level pruning to accelerate networks with scarce training samples.
1 code implementation • 16 Feb 2022 • Chenlin Zhang, Jianxin Wu, Yin Li
Self-attention based Transformer models have demonstrated impressive results for image classification and object detection, and more recently for video understanding.
Ranked #2 on Audio-Visual Event Localization on UnAV-100
2 code implementations • 26 Jan 2022 • Yun-Hao Cao, Hao Yu, Jianxin Wu
Vision Transformers (ViTs) are emerging as an alternative to convolutional neural networks (CNNs) for visual recognition.
1 code implementation • CVPR 2022 • Huanyu Wang, Junjie Liu, Xin Ma, Yang Yong, Zhenhua Chai, Jianxin Wu
Hence, previous methods optimize the compressed model layer-by-layer and try to make every layer have the same outputs as the corresponding layer in the teacher model, which is cumbersome.
1 code implementation • 30 Nov 2021 • Hao Yu, Jianxin Wu
Recently, vision transformer (ViT) and its variants have achieved promising performances in various computer vision tasks.
no code implementations • 11 Nov 2021 • Xiu-Shen Wei, Yi-Zhe Song, Oisin Mac Aodha, Jianxin Wu, Yuxin Peng, Jinhui Tang, Jian Yang, Serge Belongie
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications.
1 code implementation • ICCV 2021 • Zeren Sun, Yazhou Yao, Xiu-Shen Wei, Yongshun Zhang, Fumin Shen, Jianxin Wu, Jian Zhang, Heng-Tao Shen
Learning from the web can ease the extreme dependence of deep learning on large-scale manually labeled datasets.
5 code implementations • ICCV 2021 • Ke Zhu, Jianxin Wu
Multi-label image recognition is a challenging computer vision task of practical use.
Ranked #1 on Multi-Label Image Classification on VOC2007
no code implementations • 3 Aug 2021 • Chen-Lin Zhang, Yin Li, Jianxin Wu
Modern deep learning models require large amounts of accurately annotated data, a requirement that is often difficult to satisfy.
1 code implementation • 17 Jun 2021 • Yun-Hao Cao, Jianxin Wu
That is, a CNN has an inductive bias to naturally focus on objects, named Tobias ("The object is at sight") in this paper.
no code implementations • CVPR 2022 • Lin Sui, Chen-Lin Zhang, Jianxin Wu
However, the lack of bounding-box supervision makes its accuracy much lower than fully supervised object detection (FSOD), and currently modern FSOD techniques cannot be applied to WSOD.
2 code implementations • AAAI 2021 • Yongshun Zhang, Xiu-Shen Wei, Boyan Zhou, Jianxin Wu
In recent years, visual recognition on challenging long-tailed distributions, where classes often exhibit extremely imbalanced frequencies, has made great progress, mostly based on various complex paradigms (e.g., meta learning).
no code implementations • 28 Mar 2021 • Yifan Zhou, Yifan Ge, Jianxin Wu
Learning from examples with noisy labels has attracted increasing attention recently.
1 code implementation • ICCV 2021 • Yin-Yin He, Jianxin Wu, Xiu-Shen Wei
We tackle the long-tailed visual recognition problem from the knowledge distillation perspective by proposing a Distill the Virtual Examples (DiVE) method.
Ranked #22 on Long-tail Learning on iNaturalist 2018
1 code implementation • 25 Mar 2021 • Yun-Hao Cao, Jianxin Wu
Self-supervised learning (SSL), in particular contrastive learning, has made great progress in recent years.
1 code implementation • 12 Jan 2021 • Hao Yu, Huanyu Wang, Jianxin Wu
In this paper, we find that mixup constantly explores the representation space, and inspired by the exploration-exploitation dilemma in reinforcement learning, we propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm.
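For reference, the basic mixup operation that mWh schedules is sketched below; the scheduling logic itself (when to apply mixup and when to stop hesitating) is the paper's contribution and is omitted here. The Beta parameter alpha is a hypothetical default.

```python
import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    """Standard mixup: convex-combine pairs of examples; the two labels
    are combined in the loss with the same coefficient."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    mixed_x = lam * x + (1.0 - lam) * x[perm]
    # loss = lam * CE(model(mixed_x), y) + (1 - lam) * CE(model(mixed_x), y[perm])
    return mixed_x, y, y[perm], lam
```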
3 code implementations • 3 Nov 2020 • Guo-Hua Wang, Yifan Ge, Jianxin Wu
We argue that the teacher should give the student more freedom over its feature magnitude, and let the student pay more attention to mimicking the feature direction (see the sketch below).
Ranked #1 on Knowledge Distillation on MS COCO (mAP metric)
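A minimal rendering of this "mimic the direction, not the magnitude" idea is a cosine-style feature loss, sketched below in PyTorch. This is an illustrative reading of the sentence above, not necessarily the paper's exact loss.

```python
import torch.nn.functional as F

def direction_loss(student_feat, teacher_feat):
    """Penalize only the angle between student and teacher features,
    leaving the student free to choose its own feature magnitude."""
    s = F.normalize(student_feat, dim=1)        # unit-length student features
    t = F.normalize(teacher_feat, dim=1)        # unit-length teacher features
    return (1.0 - (s * t).sum(dim=1)).mean()    # 1 - cosine similarity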
1 code implementation • 3 Jul 2020 • Bin-Bin Gao, Xin-Xin Liu, Hong-Yu Zhou, Jianxin Wu, Xin Geng
The effectiveness of our approach has been demonstrated on both facial age and attractiveness estimation tasks.
Ranked #1 on Attractiveness Estimation on CFD
1 code implementation • CVPR 2020 • Chen-Lin Zhang, Yun-Hao Cao, Jianxin Wu
Weakly supervised object localization (WSOL) aims to localize objects with only image-level labels.
Ranked #2 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric)
1 code implementation • CVPR 2020 • Jian-Hao Luo, Jianxin Wu
Knowledge distillation is an effective approach to compensate for the weakness of limited data.
1 code implementation • 18 Nov 2019 • Yun-Hao Cao, Jianxin Wu, Hanchen Wang, Joan Lasenby
The random subspace method, known as the pillar of random forests, is good at making precise and robust predictions.
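For context, the random subspace method trains each ensemble member on a random subset of the input features and aggregates their votes. A minimal sketch with scikit-learn decision trees, assuming integer class labels (all names here are hypothetical):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_subspace_ensemble(X, y, n_members=50, ratio=0.5, seed=0):
    """Each member sees only a random subset of the feature dimensions."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    members = []
    for _ in range(n_members):
        idx = rng.choice(d, size=int(ratio * d), replace=False)
        members.append((idx, DecisionTreeClassifier().fit(X[:, idx], y)))
    return members

def predict(members, X):
    """Majority vote across members (assumes integer class labels)."""
    votes = np.stack([clf.predict(X[:, idx]) for idx, clf in members])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```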
1 code implementation • 9 Aug 2019 • Guo-Hua Wang, Jianxin Wu
In this paper, we propose a principled end-to-end framework named deep decipher (D2) for SSL.
1 code implementation • 6 Jul 2019 • Xiu-Shen Wei, Jianxin Wu, Quan Cui
Among various research areas of CV, fine-grained image analysis (FGIA) is a longstanding and fundamental problem, and has become ubiquitous in diverse real-world applications.
no code implementations • 17 Jun 2019 • Chen-Lin Zhang, Xin-Xin Liu, Jianxin Wu
We show that pre-trained weights on ImageNet improve the accuracy under the real-time action recognition setting.
3 code implementations • CVPR 2019 • Kun Yi, Jianxin Wu
Deep learning has achieved excellent performance in various computer vision tasks, but requires a lot of training examples with clean labels.
Ranked #25 on Image Classification on Clothing1M (using extra training data)
no code implementations • 13 Dec 2018 • Hong-Yu Zhou, Avital Oliver, Jianxin Wu, Yefeng Zheng
While practitioners have had an intuitive understanding of these observations, we conduct a comprehensive empirical analysis and demonstrate that: (1) the gains from SSL techniques over a fully-supervised baseline are smaller when trained from a pre-trained model than when trained from random initialization, (2) when the domain of the source data used to train the pre-trained model differs significantly from the domain of the target task, the gains from SSL are significantly higher, and (3) some SSL methods are able to advance fully-supervised baselines (like Pseudo-Label).
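For reference, the Pseudo-Label baseline mentioned in point (3) can be sketched as follows: confident predictions on unlabeled data become training targets. The confidence threshold here is a hypothetical hyperparameter.

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, unlabeled_x, threshold=0.95):
    """Only examples whose max softmax probability exceeds the
    threshold contribute a cross-entropy term against their own prediction."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= threshold
    if mask.sum() == 0:                          # no confident examples this batch
        return torch.zeros((), device=unlabeled_x.device, requires_grad=True)
    logits = model(unlabeled_x[mask])
    return F.cross_entropy(logits, pseudo[mask])
```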
no code implementations • 11 Dec 2018 • Xiu-Shen Wei, Chen-Lin Zhang, Lingqiao Liu, Chunhua Shen, Jianxin Wu
Inspired by the coarse-to-fine hierarchical process, we propose an end-to-end RNN-based Hierarchical Attention (RNN-HA) classification model for vehicle re-identification.
1 code implementation • 13 Jul 2018 • Bin-Bin Gao, Hong-Yu Zhou, Jianxin Wu, Xin Geng
Age estimation performance has been greatly improved by using convolutional neural networks.
no code implementations • 23 May 2018 • Jian-Hao Luo, Jianxin Wu
Previous filter pruning algorithms regard channel pruning and model fine-tuning as two independent steps.
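As background, a common filter-pruning baseline ranks a convolution's filters by L1 norm and keeps the strongest. The sketch below shows that baseline criterion only, not this paper's particular selection rule or its joint pruning-and-fine-tuning scheme.

```python
import torch

def prune_conv_filters(conv: torch.nn.Conv2d, keep_ratio=0.5):
    """Keep the filters with the largest L1 norms; return the pruned layer
    and the surviving indices (the next layer must drop the same inputs)."""
    w = conv.weight.data                          # (out_c, in_c, kh, kw)
    scores = w.abs().sum(dim=(1, 2, 3))           # one L1 score per filter
    n_keep = max(1, int(keep_ratio * w.size(0)))
    keep = scores.topk(n_keep).indices.sort().values
    pruned = torch.nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                             stride=conv.stride, padding=conv.padding,
                             bias=conv.bias is not None)
    pruned.weight.data.copy_(w[keep])
    if conv.bias is not None:
        pruned.bias.data.copy_(conv.bias.data[keep])
    return pruned, keep
```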
1 code implementation • 11 May 2018 • Xiu-Shen Wei, Peng Wang, Lingqiao Liu, Chunhua Shen, Jianxin Wu
To solve this problem, we propose an end-to-end trainable deep network which is inspired by the state-of-the-art fine-grained recognition model and is tailored for the FSFG task.
no code implementations • 17 Apr 2018 • Chen-Wei Xie, Hong-Yu Zhou, Jianxin Wu
To be specific, our approach outperforms the previous state-of-the-art model, DeepLab v3, by 1.5% on the PASCAL VOC 2012 val set and 0.6% on the test set, by replacing the Atrous Spatial Pyramid Pooling (ASPP) module in DeepLab v3 with the proposed Vortex Pooling.
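For orientation, the ASPP module that Vortex Pooling replaces runs parallel dilated convolutions at several rates and fuses them. The simplified sketch below shows that general pattern, not the exact Vortex Pooling design; channel counts and rates are hypothetical.

```python
import torch
import torch.nn as nn

class MultiRateContext(nn.Module):
    """ASPP-style context module: parallel 3x3 dilated convolutions at
    several rates, concatenated and projected back to the input width."""
    def __init__(self, ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(ch * len(rates), ch, 1)

    def forward(self, x):
        # padding == dilation keeps every branch at the input's spatial size
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```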
no code implementations • 8 Mar 2018 • Jianxin Wu, Jian-Hao Luo
Although binary visual representations have traditionally been designed mainly to reduce computational and storage costs in image retrieval research, this paper argues that they can also be applied to large-scale recognition and detection problems, in addition to hashing for retrieval.
no code implementations • 20 Nov 2017 • Weiyao Lin, Yang Mi, Jianxin Wu, Ke Lu, Hongkai Xiong
In this paper, we propose a novel deep-based framework for action recognition, which improves the recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams.
2 code implementations • 22 Sep 2017 • Wenhao Zheng, Hong-Yu Zhou, Ming Li, Jianxin Wu
Appropriate comments on code snippets provide insight into code functionality and are helpful for program comprehension.
no code implementations • ICCV 2017 • Jian-Hao Luo, Jianxin Wu, Weiyao Lin
Similar experiments with ResNet-50 reveal that even for a compact network, ThiNet can reduce more than half of the parameters and FLOPs, at the cost of roughly a 1% top-5 accuracy drop.
no code implementations • 20 Jul 2017 • Hong-Yu Zhou, Bin-Bin Gao, Jianxin Wu
The difficulty of image recognition has gradually increased from general category recognition to fine-grained recognition and to the recognition of some subtle attributes such as temperature and geolocation.
no code implementations • ICCV 2017 • Hong-Yu Zhou, Bin-Bin Gao, Jianxin Wu
In this paper, we propose Adaptive Feeding (AF) to combine a fast (but less accurate) detector and an accurate (but slow) detector, by adaptively determining whether an image is easy or hard and choosing an appropriate detector for it.
no code implementations • 20 Jul 2017 • Xiu-Shen Wei, Chen-Lin Zhang, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou
Reusable model design becomes desirable with the rapid expansion of computer vision and machine learning applications.
Ranked #11 on Single-Object Discovery on COCO_20k
no code implementations • 19 Jun 2017 • Jian-Hao Luo, Jianxin Wu
Experiments on the ILSVRC-12 benchmark demonstrate the effectiveness of our method.
no code implementations • 8 May 2017 • Xiu-Shen Wei, Chen-Lin Zhang, Yao Li, Chen-Wei Xie, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou
Reusable model design becomes desirable with the rapid expansion of machine learning applications.
no code implementations • 20 Mar 2017 • Weiyao Lin, Yang Shen, Junchi Yan, Mingliang Xu, Jianxin Wu, Jingdong Wang, Ke Lu
We first introduce a boosting-based approach to learn a correspondence structure which indicates the patch-wise matching probabilities between images from a target camera pair.
2 code implementations • 6 Nov 2016 • Bin-Bin Gao, Chao Xing, Chen-Wei Xie, Jianxin Wu, Xin Geng
However, it is difficult to collect sufficient training images with precise labels in some domains such as apparent age estimation, head pose estimation, multi-label classification and semantic segmentation.
Ranked #1 on Head Pose Estimation on BJUT-3D
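A core idea behind label distribution learning is to replace a single scalar label with a distribution over neighboring labels, so that nearby labels also contribute supervision. A minimal sketch for age estimation (the Gaussian width sigma and the 0-100 age range are hypothetical choices):

```python
import torch
import torch.nn.functional as F

def gaussian_label_distribution(age, bins=torch.arange(0, 101), sigma=2.0):
    """Turn one scalar label (e.g. age 25) into a discrete Gaussian over
    the label bins; returns a normalized 1D distribution."""
    dist = torch.exp(-0.5 * ((bins.float() - age) / sigma) ** 2)
    return dist / dist.sum()

def dldl_loss(logits, target_dist):
    """KL divergence between predicted and target label distributions.
    Stack per-example targets into a (batch, bins) tensor before calling."""
    return F.kl_div(F.log_softmax(logits, dim=1), target_dist,
                    reduction="batchmean")
```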
no code implementations • 10 Sep 2016 • Weiyao Lin, Yang Zhou, Hongteng Xu, Junchi Yan, Mingliang Xu, Jianxin Wu, Zicheng Liu
Our approach first leverages the complete information from given trajectories to construct a thermal transfer field which provides a context-rich way to describe the global motion pattern in a scene.
no code implementations • 24 May 2016 • Jianxin Wu, Chen-Wei Xie, Jian-Hao Luo
Large receptive field and dense prediction are both important for achieving high accuracy in pixel labeling tasks such as semantic segmentation.
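Dilated (atrous) convolution is the standard way to reconcile these two goals: it enlarges the receptive field without downsampling, so predictions stay dense. A minimal PyTorch illustration with hypothetical sizes:

```python
import torch

# A 3x3 convolution with dilation 4 spans a 9x9 window per output pixel;
# padding == dilation * (kernel - 1) / 2 preserves the spatial resolution.
conv = torch.nn.Conv2d(256, 256, kernel_size=3, dilation=4, padding=4)
x = torch.randn(1, 256, 64, 64)
y = conv(x)   # still 1 x 256 x 64 x 64: dense output, large receptive field
```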
no code implementations • 23 May 2016 • Xiu-Shen Wei, Chen-Wei Xie, Jianxin Wu
Fine-grained image recognition is a challenging computer vision problem, due to the small inter-class variations caused by highly similar subordinate categories, and the large intra-class variations in poses, scales and rotations.
1 code implementation • 18 Apr 2016 • Xiu-Shen Wei, Jian-Hao Luo, Jianxin Wu, Zhi-Hua Zhou
Moreover, on general image retrieval datasets, SCDA achieves comparable retrieval results with state-of-the-art general image retrieval approaches.
no code implementations • 31 Mar 2016 • Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, Zhi-Hua Zhou
Recently, recurrent neural networks (RNNs) have been very successful in handling sequence data.
no code implementations • 22 Feb 2016 • Guosheng Lin, Fayao Liu, Chunhua Shen, Jianxin Wu, Heng Tao Shen
Our column generation based method can be further generalized from the triplet loss to a general structured learning based framework that allows one to directly optimize multivariate performance measures.
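For reference, the triplet loss that this framework generalizes from can be sketched as below; the margin value is a hypothetical default.

```python
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    """The anchor should be closer to the positive than to the
    negative by at least the margin (hinge on squared distances)."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()
```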
no code implementations • 16 Feb 2016 • Weiyao Lin, Yang Mi, Weiyue Wang, Jianxin Wu, Jingdong Wang, Tao Mei
These semantic regions can be used to recognize pre-defined activities in crowd scenes.
1 code implementation • ICCV 2015 • Yang Shen, Weiyao Lin, Junchi Yan, Mingliang Xu, Jianxin Wu, Jingdong Wang
This paper addresses the problem of handling spatial misalignments due to camera-view changes or human-pose variations in person re-identification.
no code implementations • CVPR 2016 • Hao Yang, Joey Tianyi Zhou, Yu Zhang, Bin-Bin Gao, Jianxin Wu, Jianfei Cai
With strong labels, our framework is able to achieve state-of-the-art results in both datasets.
Ranked #17 on Multi-Label Classification on PASCAL VOC 2007
no code implementations • 21 Apr 2015 • Bin-Bin Gao, Xiu-Shen Wei, Jianxin Wu, Weiyao Lin
In this paper, we show that by carefully making good choices for various detailed but important factors in a visual recognition framework using deep learning features, one can achieve a simple, efficient, yet highly accurate image classification system.
no code implementations • 20 Apr 2015 • Yu Zhang, Xiu-Shen Wei, Jianxin Wu, Jianfei Cai, Jiangbo Lu, Viet-Anh Nguyen, Minh N. Do
Most existing works heavily rely on object/part detectors to build the correspondence between object parts by using object or object part annotations inside training images.
no code implementations • 19 Apr 2015 • Jianxin Wu, Bin-Bin Gao, Guoqing Liu
In computer vision, an entity such as an image or video is often represented as a set of instance vectors, which can be SIFT, motion, or deep learning feature vectors extracted from different parts of that entity.
no code implementations • 21 Feb 2015 • Weiyao Lin, Hang Chu, Jianxin Wu, Bin Sheng, Zhenzhong Chen
In this paper, a new heat-map-based (HMB) algorithm is proposed for group activity recognition.
no code implementations • 21 Feb 2015 • Weiyao Lin, Yuanzhe Chen, Jianxin Wu, Hanli Wang, Bin Sheng, Hongxiang Li
Based on this network, we further model people in the scene as packages while human activities can be modeled as the process of package transmission in the network.
no code implementations • 4 Jul 2014 • Guosheng Lin, Chunhua Shen, Jianxin Wu
Hashing has proven a valuable tool for large-scale information retrieval.
no code implementations • CVPR 2014 • Jianxin Wu, Yu Zhang, Weiyao Lin
High dimensional representations such as VLAD or FV have shown excellent accuracy in action recognition.
no code implementations • CVPR 2014 • Yu Zhang, Jianxin Wu, Jianfei Cai
In spite of the popularity of various feature compression methods, this paper argues that feature selection is a better choice than feature compression.