1 code implementation • ICCV 2023 • Borui Zhao, RenJie Song, Jiajun Liang
(2) Distilling knowledge from a CNN limits the network's convergence in the later training period, since the ViT's capability to integrate global information is suppressed by the CNN's local-inductive-bias supervision.
1 code implementation • ICCV 2023 • Borui Zhao, Quan Cui, RenJie Song, Jiajun Liang
In this paper, we observe a trade-off between the task and distillation losses, i.e., introducing the distillation loss limits the convergence of the task loss.
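To make the observed trade-off concrete, here is a minimal sketch of the standard joint student objective in PyTorch, whose two terms can pull the parameters in conflicting directions (the weight `alpha` and temperature `T` are illustrative, not the paper's settings):

```python
import torch.nn.functional as F

def kd_student_loss(logits_s, logits_t, target, alpha=0.5, T=4.0):
    # Task loss: cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(logits_s, target)
    # Distillation loss: KL between temperature-softened distributions.
    kd = F.kl_div(F.log_softmax(logits_s / T, dim=1),
                  F.softmax(logits_t / T, dim=1),
                  reduction="batchmean") * T ** 2
    # Optimizing the weighted sum lets the two gradients compete,
    # which is the trade-off the paper highlights.
    return (1 - alpha) * ce + alpha * kd
```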
1 code implementation • 22 May 2023 • Zheng Li, YuXuan Li, Penghai Zhao, RenJie Song, Xiang Li, Jian Yang
Diffusion models have recently achieved astonishing performance in generating high-fidelity photo-realistic images.
Ranked #2 on Few-Shot Learning on DTD
1 code implementation • 23 Mar 2023 • Xiang Li, Ge Wu, Lingfeng Yang, Wenhai Wang, RenJie Song, Jian Yang
The various elements deposited in the training history constitute a wealth of material for improving the learning of deep models.
1 code implementation • CVPR 2023 • Yuhao Chen, Xin Tan, Borui Zhao, Zhaowei Chen, RenJie Song, Jiajun Liang, Xuequan Lu
ANL introduces an additional negative pseudo-label for all unlabeled data, so that low-confidence examples can also be leveraged.
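As a rough illustration of negative pseudo-labeling (not ANL's exact adaptive rule; the fixed `neg_threshold` below is our assumption), one can mark classes with near-zero predicted probability as negatives and penalize any probability mass they receive:

```python
import torch

def negative_pseudo_label_loss(logits, neg_threshold=0.05, eps=1e-7):
    # Classes predicted with probability below the threshold are taken
    # as negative pseudo-labels: "this sample is NOT class k".
    p = torch.softmax(logits, dim=1)
    neg_mask = (p < neg_threshold).float()
    # Negative-learning objective: push p(k) toward 0 for the negatives.
    loss = -(neg_mask * torch.log(1.0 - p + eps)).sum(dim=1)
    return loss.mean()
```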
1 code implementation • 29 Nov 2022 • Zheng Li, Xiang Li, Lingfeng Yang, Borui Zhao, RenJie Song, Lei Luo, Jun Li, Jian Yang
In this paper, we propose a simple curriculum-based technique, termed Curriculum Temperature for Knowledge Distillation (CTKD), which controls the task difficulty level during the student's learning career through a dynamic and learnable temperature.
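A minimal sketch of the idea, assuming a single global temperature trained adversarially through a gradient-reversal layer (the module names, curriculum schedule, and clamp below are our simplifications, not the paper's exact implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; scaled, sign-flipped gradient backward.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class LearnableTemperature(nn.Module):
    def __init__(self, init_t=1.0):
        super().__init__()
        self.t = nn.Parameter(torch.tensor(init_t))

    def forward(self, lam):
        # The reversed gradient means updating `t` *increases* the KD loss,
        # gradually raising the task difficulty for the student.
        return GradReverse.apply(self.t, lam)

def ctkd_loss(logits_s, logits_t, temp_module, epoch, total_epochs):
    # Curriculum: ramp the adversarial strength from 0 to 1 (assumed schedule).
    lam = min(epoch / (0.2 * total_epochs), 1.0)
    T = temp_module(lam).clamp(min=0.5)  # keep the temperature positive
    log_p_s = F.log_softmax(logits_s / T, dim=1)
    p_t = F.softmax(logits_t / T, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T ** 2
```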
1 code implementation • CVPR 2022 • Borui Zhao, Quan Cui, RenJie Song, Yiyu Qiu, Jiajun Liang
To provide a novel viewpoint to study logit distillation, we reformulate the classical KD loss into two parts, i.e., target class knowledge distillation (TCKD) and non-target class knowledge distillation (NCKD).
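A sketch of that decomposition in PyTorch (the hyperparameter names `alpha`, `beta`, `T` and the large-negative masking trick are illustrative choices, not necessarily the paper's exact implementation):

```python
import torch
import torch.nn.functional as F

def dkd_loss(logits_s, logits_t, target, alpha=1.0, beta=8.0, T=4.0):
    mask = F.one_hot(target, num_classes=logits_s.size(1)).float()

    p_s = F.softmax(logits_s / T, dim=1)
    p_t = F.softmax(logits_t / T, dim=1)

    # TCKD: binary KL over (target, all-non-target) probability mass.
    ps_tgt = (p_s * mask).sum(1, keepdim=True)
    pt_tgt = (p_t * mask).sum(1, keepdim=True)
    bin_s = torch.cat([ps_tgt, 1 - ps_tgt], dim=1)
    bin_t = torch.cat([pt_tgt, 1 - pt_tgt], dim=1)
    tckd = F.kl_div(bin_s.log(), bin_t, reduction="batchmean") * T ** 2

    # NCKD: KL over the non-target classes, re-normalized after
    # suppressing the target logit with a large negative offset.
    log_ps_nt = F.log_softmax(logits_s / T - 1000.0 * mask, dim=1)
    p_t_nt = F.softmax(logits_t / T - 1000.0 * mask, dim=1)
    nckd = F.kl_div(log_ps_nt, p_t_nt, reduction="batchmean") * T ** 2

    # Classical KD couples the terms as TCKD + (1 - pt_tgt) * NCKD;
    # weighting them independently is the decoupling step.
    return alpha * tckd + beta * nckd
```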
1 code implementation • 14 Mar 2022 • Lingfeng Yang, Xiang Li, Borui Zhao, RenJie Song, Jian Yang
In semantic segmentation, RM also surpasses the baseline and CutMix by 1.9 and 1.1 mIoU points under UperNet on ADE20K, respectively.
1 code implementation • 8 Mar 2022 • Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, RenJie Song, Jiajun Liang, Boyan Zhou, Osamu Yoshie
This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task, i.e., image classification.
1 code implementation • CVPR 2022 • Lingfeng Yang, Xiang Li, RenJie Song, Borui Zhao, Juntian Tao, Shihao Zhou, Jiajun Liang, Jian Yang
Therefore, it is helpful to leverage additional information, e.g., the locations and dates of data capture, which are easily accessible but rarely exploited.