no code implementations • 25 Jan 2025 • Bohan Liu, Yang Xiao, Ruimeng Ye, Zinan Ling, Xiaolong Ma, Bo Hui
In this paper, we experimentally demonstrate that, when DBA is directly applied to decentralized FL, the attack success rate depends on the distribution of attackers in the network architecture.
1 code implementation • 26 Nov 2024 • Mingyu Cao, Gen Li, Jie Ji, JiaQi Zhang, Xiaolong Ma, Shiwei Liu, Lu Yin
Mixture-of-Experts (MoE) models have garnered significant attention for their ability to scale up neural networks while using the same or even fewer active parameters.
1 code implementation • 3 Jul 2024 • Gen Li, Zhihao Shu, Jie Ji, Minghai Qin, Fatemeh Afghah, Wei Niu, Xiaolong Ma
By splitting a video into chunks and overfitting a super-resolution (SR) model to each chunk, this scheme of SR models plus video chunks can replace traditional video transmission to enhance video quality and transmission efficiency.
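A minimal sketch of the chunk-wise overfitting idea (the TinySR model, chunk length, and training loop below are illustrative assumptions, not the paper's implementation): split the frame sequence into fixed-length chunks and overfit a tiny SR network to each chunk, so that each low-resolution chunk plus its small model stands in for the high-resolution stream.

```python
# Minimal sketch of chunk-wise SR overfitting (hypothetical; not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySR(nn.Module):
    """A deliberately small SR net that is overfit to a single video chunk."""
    def __init__(self, scale=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3 * scale * scale, 3, padding=1),
        )
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.body(x))

def overfit_chunks(lr_frames, hr_frames, chunk_len=32, steps=200):
    """Split frames into chunks and overfit one TinySR model per chunk."""
    models = []
    for start in range(0, lr_frames.size(0), chunk_len):
        lr = lr_frames[start:start + chunk_len]
        hr = hr_frames[start:start + chunk_len]
        model = TinySR()
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        for _ in range(steps):
            opt.zero_grad()
            loss = F.l1_loss(model(lr), hr)
            loss.backward()
            opt.step()
        models.append(model)  # transmit the LR chunk plus this small model
    return models

# Toy usage: 64 low-res frames (3x32x32) and their 2x high-res targets.
lr = torch.rand(64, 3, 32, 32)
hr = torch.rand(64, 3, 64, 64)
models = overfit_chunks(lr, hr, chunk_len=32, steps=10)
```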
no code implementations • 9 Apr 2024 • Hossein Rajoli, Sahand Khoshdel, Fatemeh Afghah, Xiaolong Ma
However, the dominance of the center loss over the other losses leads the model to miss features that those losses are sensitive to.
no code implementations • 15 Dec 2023 • Yucong Dai, Gen Li, Feng Luo, Xiaolong Ma, Yongkai Wu
To address this, we define a fair pruning task where a sparse model is derived subject to fairness requirements.
no code implementations • 3 May 2023 • Bo Hui, Da Yan, Xiaolong Ma, Wei-Shinn Ku
Therefore, we propose two techniques to improve GNN performance when the graph sparsity is high.
2 code implementations • CVPR 2023 • Gen Li, Jie Ji, Minghai Qin, Wei Niu, Bin Ren, Fatemeh Afghah, Linke Guo, Xiaolong Ma
To reconcile these, we propose a novel method for high-quality and efficient video resolution upscaling, which leverages spatial-temporal information to accurately divide the video into chunks, thus keeping the number of chunks as well as the model size to a minimum.
1 code implementation • 19 Nov 2022 • Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang
Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.
1 code implementation • 2 Nov 2022 • Xuan Shen, Zhenglun Kong, Minghai Qin, Peiyan Dong, Geng Yuan, Xin Meng, Hao Tang, Xiaolong Ma, Yanzhi Wang
That is, there exists a subset of input image patches such that a ViT can be trained from scratch using only this subset of patches and achieve accuracy similar to ViTs trained using all image patches.
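A rough illustration of training on a patch subset (the norm-based scoring rule, tiny transformer, and keep ratio below are hypothetical stand-ins, not the paper's selection criterion): score the patch tokens, keep only the top-k, and train the encoder on that subset.

```python
# Sketch of training on a subset of image patches (hypothetical selection rule).
import torch
import torch.nn as nn

class PatchSubsetViT(nn.Module):
    def __init__(self, img=32, patch=4, dim=64, depth=2, heads=4, classes=10, keep=32):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, (img // patch) ** 2, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 2, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, classes)
        self.keep = keep  # number of patch tokens kept per image

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2) + self.pos   # (B, N, dim)
        scores = tokens.norm(dim=-1)                   # hypothetical patch score
        idx = scores.topk(self.keep, dim=1).indices    # keep only the top-k tokens
        subset = torch.gather(tokens, 1,
                              idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1)))
        return self.head(self.encoder(subset).mean(dim=1))

model = PatchSubsetViT()
logits = model(torch.rand(8, 3, 32, 32))   # trained as usual, but on 32 of 64 patches
```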
1 code implementation • CVPR 2022 • Zejiang Hou, Minghai Qin, Fei Sun, Xiaolong Ma, Kun Yuan, Yi Xu, Yen-Kuang Chen, Rong Jin, Yuan Xie, Sun-Yuan Kung
However, conventional pruning methods have limitations in that they are restricted to the pruning process only and they require a fully pre-trained large model.
1 code implementation • 9 Feb 2022 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
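For context, a condensed sketch of the standard winning-ticket procedure, iterative magnitude pruning with rewinding to the original initialization (the hyperparameters and toy training loop are illustrative, not this paper's setup):

```python
# Sketch of winning-ticket extraction: iterative magnitude pruning + rewinding.
import copy
import torch
import torch.nn as nn

def find_winning_ticket(model, train_fn, prune_frac=0.2, rounds=3):
    init_state = copy.deepcopy(model.state_dict())          # theta_0
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}
    for _ in range(rounds):
        train_fn(model, masks)                               # train with masks applied
        for name, p in model.named_parameters():             # prune smallest surviving weights
            if name in masks:
                alive = p[masks[name].bool()].abs()
                k = int(prune_frac * alive.numel())
                if k > 0:
                    thresh = alive.kthvalue(k).values
                    masks[name] *= (p.abs() > thresh).float()
        model.load_state_dict(init_state)                    # rewind to the original init
        for name, p in model.named_parameters():             # apply the mask to theta_0
            if name in masks:
                p.data *= masks[name]
    return model, masks

net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))

def train_fn(model, masks):                                  # stand-in training loop
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(5):
        loss = model(torch.randn(32, 20)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        for name, p in model.named_parameters():             # keep pruned weights at zero
            if name in masks:
                p.data *= masks[name]

ticket, masks = find_winning_ticket(net, train_fn)
```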
no code implementations • 17 Jan 2022 • Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang
To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.
1 code implementation • 27 Dec 2021 • Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Xuan Shen, Geng Yuan, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang
Moreover, our framework can guarantee the identified model to meet resource specifications of mobile devices and FPGA, and even achieve the real-time execution of DeiT-T on mobile platforms.
Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)
no code implementations • 20 Dec 2021 • Fei Sun, Minghai Qin, Tianyun Zhang, Xiaolong Ma, Haoran Li, Junwen Luo, Zihao Zhao, Yen-Kuang Chen, Yuan Xie
Our experiments show that GS patterns consistently make better trade-offs between accuracy and computation efficiency compared to conventional structured sparse patterns.
1 code implementation • NeurIPS 2021 • Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin
Systematic evaluations of accuracy, training speed, and memory footprint are conducted, where the proposed MEST framework consistently outperforms representative SOTA works.
no code implementations • 29 Sep 2021 • Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., $\textit{winning tickets}$) that can be trained in isolation to match full accuracy.
no code implementations • 29 Sep 2021 • Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang
Recently, the Vision Transformer (ViT) has continuously established new milestones in the computer vision field, but its high computation and memory cost make its deployment in industrial production difficult.
no code implementations • 25 Aug 2021 • Wei Niu, Zhengang Li, Xiaolong Ma, Peiyan Dong, Gang Zhou, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren
It necessitates sparse model inference via weight pruning, i.e., DNN weight sparsity, and it is desirable to design a new DNN weight sparsity scheme that can facilitate real-time inference on mobile devices while preserving a high sparse model accuracy.
no code implementations • 9 Jul 2021 • Henry Kvinge, Colby Wight, Sarah Akers, Scott Howland, Woongjo Choi, Xiaolong Ma, Luke Gosink, Elizabeth Jurrus, Keerti Kappagantula, Tegan H. Emerson
As both machine learning models and the datasets on which they are evaluated have grown in size and complexity, the practice of using a few summary statistics to understand model performance has become increasingly problematic.
2 code implementations • NeurIPS 2021 • Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang
Based on our analysis, we summarize a guideline for parameter settings with regard to specific architecture characteristics, which we hope will catalyze research progress on the lottery ticket hypothesis.
1 code implementation • ICLR 2022 • Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie
It addresses the shortcomings of the previous works by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
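A rough schematic of this grow-then-prune cycle (the layer grouping, schedule, and 80% sparsity level are illustrative assumptions): one group of layers is made dense, trained for a while, then pruned back to the target sparsity before the next group is grown.

```python
# Sketch of cyclically growing one layer to dense, training, then re-pruning it.
import torch
import torch.nn as nn

def magnitude_mask(weight, sparsity):
    """Keep the largest-magnitude entries; zero out a `sparsity` fraction."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    thresh = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > thresh).float()

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 10))
layers = [m for m in model if isinstance(m, nn.Linear)]
masks = {id(m): magnitude_mask(m.weight, 0.8) for m in layers}     # start 80% sparse
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for cycle in range(3):
    dense_layer = layers[cycle % len(layers)]
    masks[id(dense_layer)] = torch.ones_like(dense_layer.weight)   # grow this layer dense
    for _ in range(20):                                            # train for a while
        loss = model(torch.randn(16, 32)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        for m in layers:                                           # keep the other layers sparse
            m.weight.data *= masks[id(m)]
    # prune the grown layer back to the target sparsity before moving on
    masks[id(dense_layer)] = magnitude_mask(dense_layer.weight, 0.8)
    dense_layer.weight.data *= masks[id(dense_layer)]
```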
no code implementations • 16 Jun 2021 • Geng Yuan, Zhiheng Liao, Xiaolong Ma, Yuxuan Cai, Zhenglun Kong, Xuan Shen, Jingyan Fu, Zhengang Li, Chengming Zhang, Hongwu Peng, Ning Liu, Ao Ren, Jinhui Wang, Yanzhi Wang
More importantly, our method does not require extra hardware cost compared to the traditional two-column mapping scheme.
no code implementations • 16 Jun 2021 • Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding
With weights stored as conductances in the ReRAM crossbar cells, when the input vector is applied to the word lines, the matrix-vector multiplication results are generated as currents on the bit lines.
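This analog computation can be mimicked numerically: map signed weights onto pairs of non-negative conductances, apply read voltages on the word lines, and sum the resulting bit-line currents. The sketch below is a functional model only and ignores device non-idealities; the conductance range and read voltage are arbitrary assumptions.

```python
# Functional sketch of a ReRAM crossbar matrix-vector multiply (ideal devices).
import numpy as np

def crossbar_mvm(weights, x, g_min=1e-6, g_max=1e-4, v_read=0.2):
    """weights: (rows, cols) signed matrix; x: (rows,) input vector in [0, 1]."""
    w = weights / np.abs(weights).max()
    # Signed weights need two columns of non-negative conductances (differential pair).
    g_pos = g_min + (g_max - g_min) * np.clip(w, 0, None)
    g_neg = g_min + (g_max - g_min) * np.clip(-w, 0, None)
    v = v_read * x                           # word-line voltages encode the input
    i_pos = v @ g_pos                        # Kirchhoff: bit-line current = sum of v * g
    i_neg = v @ g_neg
    scale = np.abs(weights).max() / ((g_max - g_min) * v_read)
    return (i_pos - i_neg) * scale           # differential read recovers W^T x

W = np.random.randn(8, 4)
x = np.random.rand(8)
print(np.allclose(crossbar_mvm(W, x), W.T @ x, atol=1e-6))
```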
no code implementations • 6 Jun 2021 • Xuan Shen, Geng Yuan, Wei Niu, Xiaolong Ma, Jiexiong Guan, Zhengang Li, Bin Ren, Yanzhi Wang
The rapid development of autonomous driving, abnormal behavior detection, and behavior recognition creates an increasing demand for multi-person pose estimation-based applications, especially on mobile platforms.
no code implementations • 19 Feb 2021 • Ning Liu, Geng Yuan, Zhengping Che, Xuan Shen, Xiaolong Ma, Qing Jin, Jian Ren, Jian Tang, Sijia Liu, Yanzhi Wang
In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin, 2018) pointed out that there could exist a winning ticket (i.e., a properly pruned sub-network together with the original weight initialization) that can achieve performance competitive with the original dense network.
no code implementations • 12 Apr 2020 • Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang
To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning.
no code implementations • 13 Mar 2020 • Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiao-Lin Xu, Yanzhi Wang
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
no code implementations • 11 Feb 2020 • Zhanhong Tan, Jiebo Song, Xiaolong Ma, Sia-Huat Tan, Hongyang Chen, Yuanqing Miao, Yi-Fu Wu, Shaokai Ye, Yanzhi Wang, Dehui Li, Kaisheng Ma
Weight pruning is a powerful technique to realize model compression.
no code implementations • 23 Jan 2020 • Xiaolong Ma, Zhengang Li, Yifan Gong, Tianyun Zhang, Wei Niu, Zheng Zhan, Pu Zhao, Jian Tang, Xue Lin, Bin Ren, Yanzhi Wang
Accelerating DNN execution on various resource-limited computing platforms has been a long-standing problem.
no code implementations • 23 Jan 2020 • Zhengang Li, Yifan Gong, Xiaolong Ma, Sijia Liu, Mengshu Sun, Zheng Zhan, Zhenglun Kong, Geng Yuan, Yanzhi Wang
Structured weight pruning is a representative DNN model compression technique for hardware efficiency and inference acceleration.
no code implementations • ECCV 2020 • Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang
Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby achieving acceleration on various platforms.
no code implementations • 1 Jan 2020 • Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren
Weight pruning of DNNs is proposed, but existing schemes represent two extremes in the design space: non-structured pruning is fine-grained, accurate, but not hardware friendly; structured pruning is coarse-grained, hardware-efficient, but with higher accuracy loss.
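The two extremes can be contrasted in a few lines (the thresholds and 50% ratios are illustrative): non-structured pruning zeros individual weights wherever their magnitude is small, while structured pruning removes entire filters.

```python
# Contrast of non-structured (element-wise) vs structured (filter-wise) pruning.
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 32, kernel_size=3)
W = conv.weight.data                       # shape (32, 16, 3, 3)

# Non-structured: zero the 50% smallest-magnitude individual weights (irregular sparsity).
thresh = W.abs().flatten().kthvalue(W.numel() // 2).values
unstructured_mask = (W.abs() > thresh).float()

# Structured: remove the 50% of output filters with the smallest L1 norm (regular sparsity).
filter_norms = W.abs().sum(dim=(1, 2, 3))                 # one norm per output filter
keep = filter_norms.topk(W.size(0) // 2).indices
structured_mask = torch.zeros(W.size(0))
structured_mask[keep] = 1.0
structured_mask = structured_mask.view(-1, 1, 1, 1).expand_as(W)

print(unstructured_mask.mean(), structured_mask.mean())   # both ~0.5 density
```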
no code implementations • 24 Nov 2019 • Geng Yuan, Xiaolong Ma, Sheng Lin, Zhengang Li, Caiwen Ding
Thus, the footprint and power consumption of SOT-MRAM PIM can be reduced while increasing the overall system throughput, making our proposed ADMM-based SOT-MRAM PIM more energy efficient and suitable for embedded systems or IoT devices.
no code implementations • 6 Sep 2019 • Xiaolong Ma, Fu-Ming Guo, Wei Niu, Xue Lin, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang
Model compression techniques on Deep Neural Network (DNN) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective method.
no code implementations • 29 Aug 2019 • Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang
Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model.
no code implementations • 27 Aug 2019 • Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang
To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework for DNN applications.
no code implementations • 6 Jul 2019 • Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jieping Ye
This work proposes AutoCompress, an automatic structured pruning framework with the following key performance improvements: (i) effectively incorporate the combination of structured pruning schemes in the automatic process; (ii) adopt the state-of-the-art ADMM-based structured weight pruning as the core algorithm, and propose an innovative additional purification step for further weight reduction without accuracy loss; and (iii) develop an effective heuristic search method enhanced by experience-based guided search, replacing the prior deep reinforcement learning technique, which has an underlying incompatibility with the target pruning problem.
no code implementations • 3 Jul 2019 • Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang
Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of both storage and computation efficiency.
no code implementations • 2 May 2019 • Wei Niu, Xiaolong Ma, Yanzhi Wang, Bin Ren
With the rapid emergence of a spectrum of high-end mobile devices, many applications that formerly required desktop-level computation capability can now run on these devices without any problem.
no code implementations • 2 May 2019 • Sheng Lin, Xiaolong Ma, Shaokai Ye, Geng Yuan, Kaisheng Ma, Yanzhi Wang
Weight quantization is one of the most important techniques for Deep Neural Network (DNN) model compression.
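As a minimal illustration of the general idea, a plain symmetric uniform quantizer is sketched below (this is not the specific quantization scheme studied in the paper):

```python
# Minimal uniform weight quantization sketch (not the paper's specific scheme).
import torch

def quantize_weights(w, num_bits=4):
    """Symmetric uniform quantization of a weight tensor to `num_bits` bits."""
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 7 for 4-bit signed
    scale = w.abs().max() / qmax
    w_int = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_int * scale, w_int, scale             # dequantized, integer codes, scale

w = torch.randn(64, 64)
w_q, codes, scale = quantize_weights(w, num_bits=4)
print((w - w_q).abs().mean())                      # average quantization error
```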
no code implementations • 30 Apr 2019 • Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang
State-of-the-art DNN structures involve high computation and great demand for memory storage, which poses intense challenges for DNN framework resources.
2 code implementations • 23 Mar 2019 • Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang
A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results.
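In the usual ADMM reformulation, training alternates between a loss-plus-penalty update of the weights and a Euclidean projection of an auxiliary variable onto the sparsity constraint; the sketch below shows one such loop under simplifying assumptions (a single linear layer, one gradient step per ADMM iteration, magnitude-based projection).

```python
# Sketch of ADMM-based pruning for one weight tensor (simplified; illustrative only).
import torch
import torch.nn as nn

def project_sparse(w, keep):
    """Euclidean projection onto {W : at most `keep` nonzeros} = keep top magnitudes."""
    flat = w.flatten()
    idx = flat.abs().topk(keep).indices
    out = torch.zeros_like(flat)
    out[idx] = flat[idx]
    return out.view_as(w)

model = nn.Linear(32, 10)
x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
rho, keep = 1e-2, 64                                  # penalty strength, nonzeros allowed
Z = project_sparse(model.weight.data, keep)           # auxiliary sparse variable
U = torch.zeros_like(Z)                               # scaled dual variable
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for it in range(50):
    # W-update: minimize task loss + (rho/2) * ||W - Z + U||^2
    loss = nn.functional.cross_entropy(model(x), y)
    admm_pen = 0.5 * rho * (model.weight - Z + U).pow(2).sum()
    opt.zero_grad(); (loss + admm_pen).backward(); opt.step()
    # Z-update: project W + U onto the sparsity constraint; then dual update
    Z = project_sparse(model.weight.data + U, keep)
    U = U + model.weight.data - Z

model.weight.data = project_sparse(model.weight.data, keep)   # final hard pruning
```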
1 code implementation • 29 Jul 2018 • Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang
Without loss of accuracy on the AlexNet model, we achieve 2.58X and 3.65X average measured speedup on two GPUs, clearly outperforming the prior work.
no code implementations • 28 Mar 2018 • Caiwen Ding, Ao Ren, Geng Yuan, Xiaolong Ma, Jiayu Li, Ning Liu, Bo Yuan, Yanzhi Wang
For FPGA implementations on deep convolutional neural networks (DCNNs), we achieve at least 152X and 72X improvement in performance and energy efficiency, respectively, using the SWM-based framework, compared with the baseline of the IBM TrueNorth processor under the same accuracy constraints using the MNIST, SVHN, and CIFAR-10 datasets.
no code implementations • 20 Mar 2018 • Zhe Li, Xiaolong Ma, Hongjia Li, Qiyuan An, Aditya Singh Rathore, Qinru Qiu, Wenyao Xu, Yanzhi Wang
It is of vital importance to enable 3D printers to identify the objects to be printed, so that the manufacturing procedure of an illegal weapon can be terminated at the early stage.
no code implementations • 18 Feb 2018 • Yanzhi Wang, Caiwen Ding, Zhe Li, Geng Yuan, Siyu Liao, Xiaolong Ma, Bo Yuan, Xuehai Qian, Jian Tang, Qinru Qiu, Xue Lin
Hardware accelerations of deep learning systems have been extensively investigated in industry and academia.
no code implementations • 15 Feb 2018 • Hongjia Li, Xiaolong Ma, Aditya Singh Rathore, Zhe Li, Qiyuan An, Chen Song, Wenyao Xu, Yanzhi Wang
The rapid development in additive manufacturing (AM), also known as 3D printing, has brought about potential risk and security issues along with significant benefits.
no code implementations • 3 Feb 2018 • Xiaolong Ma, Yi-Peng Zhang, Geng Yuan, Ao Ren, Zhe Li, Jie Han, Jingtong Hu, Yanzhi Wang
However, in these works, the memory design optimization is neglected for weight storage, which will inevitably result in large hardware cost.
no code implementations • 29 Aug 2017 • Caiwen Ding, Siyu Liao, Yanzhi Wang, Zhe Li, Ning Liu, Youwei Zhuo, Chao Wang, Xuehai Qian, Yu Bai, Geng Yuan, Xiaolong Ma, Yi-Peng Zhang, Jian Tang, Qinru Qiu, Xue Lin, Bo Yuan
As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy.
no code implementations • 25 Nov 2016 • Xiaolong Ma, Xiatian Zhu, Shaogang Gong, Xudong Xie, Jianming Hu, Kin-Man Lam, Yisheng Zhong
Crucially, this model does not require pairwise labelled training data (i.e., it is unsupervised) and is therefore readily scalable to large-scale camera networks of arbitrary camera pairs without the need for exhaustive data annotation for every camera pair.