no code implementations • 17 Apr 2024 • Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang
In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models.
1 code implementation • 27 Feb 2024 • Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang
In this paper, we propose the SAM-DiffSR model, which can utilize the fine-grained structure information from SAM in the process of sampling noise to improve the image quality without additional computational cost during inference.
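The abstract does not spell out the mechanism, but a minimal sketch of the general idea follows: inject SAM-derived structure into the diffusion noise at training time only, so the sampling path at inference is unchanged and costs nothing extra. The function name, the `sam_mask_emb` input, and the additive modulation are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def structure_modulated_noise(x0, sam_mask_emb, t, alphas_cumprod):
    """Hedged sketch: shift the training-time diffusion noise by a
    structural embedding precomputed from SAM masks (hypothetical
    `sam_mask_emb`, same shape as x0). Inference is untouched, so
    there is no additional cost when sampling."""
    noise = torch.randn_like(x0)
    shifted_noise = noise + sam_mask_emb          # structure-aware noise mean
    a = alphas_cumprod[t].view(-1, 1, 1, 1)       # cumulative alpha at step t
    x_t = a.sqrt() * x0 + (1.0 - a).sqrt() * shifted_noise
    return x_t, shifted_noise                      # target for the denoiser
```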
1 code implementation • 7 Feb 2024 • Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu
Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.
1 code implementation • NeurIPS 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu
To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.
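A hedged sketch of how heterogeneous distillation can work in a shared space: rather than aligning incompatible feature maps (e.g. CNN grids vs. ViT tokens), project intermediate student features into the logit space and match the teacher there. The `ExitBranch` module and `ofa_kd_loss` names are illustrative, not the paper's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExitBranch(nn.Module):
    """Hypothetical exit branch: pools an intermediate feature map and
    projects it into the architecture-agnostic logit space."""
    def __init__(self, in_dim, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(in_dim, num_classes)

    def forward(self, feat):                      # feat: (B, C, H, W)
        return self.fc(self.pool(feat).flatten(1))

def ofa_kd_loss(branch_logits, teacher_logits, tau=4.0):
    """Standard temperature-scaled KL in logit space; because both
    models meet in logits, CNN<->ViT pairs need no feature alignment."""
    p_t = F.softmax(teacher_logits / tau, dim=1)
    log_p_s = F.log_softmax(branch_logits / tau, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * tau * tau
```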
no code implementations • 10 Sep 2023 • Guanyu Xu, Zhiwei Hao, Yong Luo, Han Hu, Jianping An, Shiwen Mao
Our objective is to achieve fast and energy-efficient collaborative inference while maintaining accuracy comparable to that of large ViTs.
1 code implementation • 25 May 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang
The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results.
1 code implementation • 24 May 2022 • Zhiwei Hao, Guanyu Xu, Yong Luo, Han Hu, Jianping An, Shiwen Mao
In this paper, we study the multi-agent collaborative inference scenario, where a single edge server coordinates the inference of multiple UEs.
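One plausible shape of this setup, sketched under assumptions (the split point, module names, and feature sizes below are invented for illustration): each UE runs a lightweight head locally and uploads compact features, and the edge server batches features from all UEs through one shared tail, amortizing its compute across agents.

```python
import torch
import torch.nn as nn

class UEHead(nn.Module):
    """Hypothetical on-device head: cheap early layers on the UE."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)                    # compact feature sent uplink

class ServerTail(nn.Module):
    """Hypothetical shared tail run once on the edge server."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_classes),
        )
    def forward(self, feats):
        return self.net(feats)

# The server concatenates features from several UEs into one batch,
# which is the coordination the multi-agent scenario studies.
ue_feats = [UEHead()(torch.randn(1, 3, 224, 224)) for _ in range(4)]
logits = ServerTail()(torch.cat(ue_feats, dim=0))
```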
1 code implementation • 24 May 2022 • Zhiwei Hao, Yong Luo, Zhi Wang, Han Hu, Jianping An
To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module.
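A minimal sketch of two of the three modules named above, the multi-header student and the attention-based aggregation, assuming a shared backbone with one header per teacher (the asymmetric adversarial data-free generator is omitted, and all dimensions are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeaderStudent(nn.Module):
    """Hedged sketch: a shared backbone feeds one header per teacher;
    learned attention weights aggregate the headers' predictions."""
    def __init__(self, backbone_dim=256, num_teachers=3, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, backbone_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.headers = nn.ModuleList(
            nn.Linear(backbone_dim, num_classes) for _ in range(num_teachers)
        )
        self.attn = nn.Linear(backbone_dim, num_teachers)

    def forward(self, x):
        z = self.backbone(x)                                       # (B, D)
        logits = torch.stack([h(z) for h in self.headers], dim=1)  # (B, T, C)
        w = F.softmax(self.attn(z), dim=1).unsqueeze(-1)           # (B, T, 1)
        return (w * logits).sum(dim=1)    # attention-aggregated prediction
```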
1 code implementation • 3 Jul 2021 • Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang
Specifically, we train a tiny student model to match a pre-trained teacher model in the patch-level manifold space.
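Matching in a manifold space can be sketched as follows: compare the intra-image patch-similarity structure of student and teacher rather than raw features, so the two models may even have different embedding widths. This is a simplified illustration of patch-level relation matching, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def manifold_distill_loss(student_patches, teacher_patches):
    """Hedged sketch of patch-level manifold matching. Inputs are
    (B, N, D) patch embeddings; D may differ between the models
    because only the N x N relation matrices are compared."""
    s = F.normalize(student_patches, dim=-1)
    t = F.normalize(teacher_patches, dim=-1)
    rel_s = s @ s.transpose(1, 2)     # (B, N, N) student patch manifold
    rel_t = t @ t.transpose(1, 2)     # (B, N, N) teacher patch manifold
    return F.mse_loss(rel_s, rel_t)
```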