no code implementations • ECCV 2020 • Zhijian Liu, Zhanghao Wu, Chuang Gan, Ligeng Zhu, Song Han
Third, our solution is efficient on the edge since the majority of the workload is delegated to the cloud, and our mixing and de-mixing processes introduce very few extra computations.
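To make the "very few extra computations" claim concrete, here is a minimal numpy sketch of one way mixing and de-mixing can work: the edge multiplies a batch of private samples by a random invertible matrix before offloading, and inverts that matrix on the returned results. It assumes the cloud-side computation is (approximately) linear so that mixing commutes with it; the matrix `A` and the toy `cloud_forward` are illustrative stand-ins, not the paper's actual protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "cloud" model: assumed (approximately) linear so mixing commutes with it.
W = rng.standard_normal((16, 8))

def cloud_forward(x):
    return x @ W  # the heavy computation delegated to the cloud

# Edge side: k private samples to be protected.
k = 4
X = rng.standard_normal((k, 16))

# Mix: multiply by a random invertible matrix before the data leaves the device.
A = rng.standard_normal((k, k))   # mixing coefficients, kept secret on the edge
X_mixed = A @ X                   # what is actually sent to the cloud

# Cloud computes on mixed data only.
Y_mixed = cloud_forward(X_mixed)

# De-mix on the edge: a small k x k inverse, cheap compared to cloud_forward.
Y = np.linalg.inv(A) @ Y_mixed

# Matches running the cloud model on the raw data (up to numerical error).
assert np.allclose(Y, cloud_forward(X), atol=1e-6)
```

The de-mixing cost is a k x k solve, independent of the size of the cloud model, which is why the edge overhead stays small.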
1 code implementation • 30 Jun 2022 • Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han
To reduce the memory footprint, we propose Sparse Update to skip the gradient computation of less important layers and sub-tensors.
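A hedged PyTorch sketch of the sparse-update idea described above: gradients are skipped for frozen layers via `requires_grad`, and sub-tensor updates are emulated by masking a gradient before the optimizer step. The layer choice and row mask are arbitrary placeholders, not the importance-based selection used in the paper, and this only emulates the update pattern; the real memory savings come from not storing activations and gradients for the skipped parts at all.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# Layer-level sparsity: freeze everything, then re-enable only the layers
# deemed important (here: the last two Linear layers, chosen arbitrarily).
for p in model.parameters():
    p.requires_grad_(False)
for layer in (model[2], model[4]):
    for p in layer.parameters():
        p.requires_grad_(True)

# Sub-tensor sparsity: update only a slice of one weight tensor by zeroing
# the rest of its gradient before the optimizer step.
row_mask = torch.zeros(64, 1)
row_mask[:16] = 1.0  # update only the first 16 output rows (illustrative)

opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=0.01)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()

model[2].weight.grad *= row_mask  # (64, 64) grad * (64, 1) mask broadcasts over columns
opt.step()
opt.zero_grad()
```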
no code implementations • 25 Apr 2022 • Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.
no code implementations • NeurIPS 2021 • Ligeng Zhu, Hongzhou Lin, Yao Lu, Yujun Lin, Song Han
Federated Learning is an emerging direction in distributed machine learning that enables jointly training a model without sharing the data.
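As background for this snippet, here is a minimal numpy sketch of one federated round in the standard FedAvg style: each client runs local SGD on its private data and only model parameters are exchanged. This illustrates the general setting, not necessarily the method proposed in this paper; the linear-regression clients and step counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(w, X, y, lr=0.1, steps=5):
    """A few local SGD steps on a client's private (X, y); the data never leaves."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w = w - lr * grad
    return w

# Three clients, each holding a private linear-regression dataset.
clients = [(rng.standard_normal((20, 5)), rng.standard_normal(20)) for _ in range(3)]
w_global = np.zeros(5)

for round_ in range(10):
    # Each client starts from the global model and trains locally.
    local_models = [local_update(w_global.copy(), X, y) for X, y in clients]
    # The server aggregates only the parameters (FedAvg-style averaging).
    w_global = np.mean(local_models, axis=0)
```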
1 code implementation • 2 Nov 2020 • Yaoyao Ding, Ligeng Zhu, Zhihao Jia, Gennady Pekhimenko, Song Han
To accelerate CNN inference, existing deep learning frameworks focus on optimizing intra-operator parallelization.
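A hedged PyTorch sketch contrasting the two kinds of parallelism this snippet alludes to: each convolution is already parallelized internally by its GPU kernel (intra-operator), while the two independent branches below are additionally launched on separate CUDA streams so they can overlap (inter-operator). This only illustrates the concept; it is not the scheduler proposed in the paper.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Two independent branches over the same input, as in Inception-style blocks.
branch_a = nn.Conv2d(64, 64, kernel_size=3, padding=1).to(device)
branch_b = nn.Conv2d(64, 64, kernel_size=1).to(device)
x = torch.randn(1, 64, 56, 56, device=device)

if device == "cuda":
    stream_a, stream_b = torch.cuda.Stream(), torch.cuda.Stream()
    torch.cuda.synchronize()                 # make sure x is ready on all streams
    with torch.cuda.stream(stream_a):
        out_a = branch_a(x)                  # intra-op parallelism inside the kernel
    with torch.cuda.stream(stream_b):
        out_b = branch_b(x)                  # inter-op parallelism across streams
    torch.cuda.synchronize()                 # wait for both branches
else:
    out_a, out_b = branch_a(x), branch_b(x)  # CPU fallback, sequential

out = torch.cat([out_a, out_b], dim=1)
print(out.shape)
```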
1 code implementation • NeurIPS 2020 • Han Cai, Chuang Gan, Ligeng Zhu, Song Han
Furthermore, combined with feature extractor adaptation, TinyTL provides 7.3-12.9x memory saving without sacrificing accuracy compared to fine-tuning the full Inception-V3.
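A hedged PyTorch sketch of the memory-saving direction behind TinyTL: freeze the feature extractor's weights and train only biases plus the new task head, so the large activations needed for weight gradients need not be kept. The toy backbone and layer split are illustrative; the paper additionally relies on lite residual modules not shown here, and a framework may or may not realize all of the activation savings automatically.

```python
import torch
import torch.nn as nn

# Toy pre-trained backbone + new task head (stand-ins for the paper's networks).
backbone = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(64, 10)

# Freeze all backbone weights; keep only bias terms (and the new head) trainable.
for name, p in backbone.named_parameters():
    p.requires_grad_(name.endswith(".bias"))

trainable = [p for p in backbone.parameters() if p.requires_grad] + list(head.parameters())
opt = torch.optim.SGD(trainable, lr=0.01, momentum=0.9)

x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(head(backbone(x)), y)
loss.backward()
opt.step()
```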
4 code implementations • ACL 2020 • Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han
To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search.
Ranked #21 on Machine Translation on WMT2014 English-French
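A hedged Python sketch of the hardware-aware search loop the HAT snippet describes: candidate sub-Transformer configurations are sampled, filtered by a per-hardware latency estimate, and ranked by a quality proxy. Both `estimate_latency` and `estimate_quality` are placeholders standing in for the paper's latency predictor and trained SuperTransformer, and the search-space dimensions are illustrative.

```python
import random

random.seed(0)

# Illustrative search space for a sub-Transformer (not the paper's exact space).
SEARCH_SPACE = {
    "num_layers": [4, 5, 6],
    "embed_dim": [384, 512, 640],
    "ffn_dim": [1024, 2048, 3072],
    "num_heads": [4, 8],
}

def sample_config():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def estimate_latency(cfg):
    # Placeholder for a hardware-specific latency predictor (ms).
    return 0.01 * cfg["num_layers"] * (cfg["embed_dim"] + 0.5 * cfg["ffn_dim"])

def estimate_quality(cfg):
    # Placeholder for scoring the sub-network inherited from a SuperTransformer.
    return cfg["num_layers"] * cfg["embed_dim"] * cfg["num_heads"] ** 0.5

LATENCY_BUDGET_MS = 60.0
best = None
for _ in range(1000):
    cfg = sample_config()
    if estimate_latency(cfg) > LATENCY_BUDGET_MS:
        continue  # violates the hardware constraint for this platform
    score = estimate_quality(cfg)
    if best is None or score > best[0]:
        best = (score, cfg)

print("best config under budget:", best[1])
```

Running the same loop with a different latency predictor and budget yields a different specialized architecture per hardware platform, which is the point of making the search hardware-aware.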
no code implementations • 25 Sep 2019 • Ligeng Zhu, Yao Lu, Yujun Lin, Song Han
Traditional synchronous distributed training is performed inside a cluster, since it requires a high-bandwidth, low-latency network (e.g., 25Gb Ethernet or InfiniBand).
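To make the communication pattern concrete, here is a minimal numpy simulation of one synchronous data-parallel step: every worker must exchange its full gradient (an all-reduce, simulated here as a mean) before any of them can apply the update, which is why fast links are assumed. This illustrates the standard setup the snippet refers to, not the method this paper proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
num_workers, dim = 4, 10
w = np.zeros(dim)

# Each worker holds its own shard of the data.
shards = [(rng.standard_normal((32, dim)), rng.standard_normal(32))
          for _ in range(num_workers)]

for step in range(100):
    # Every worker computes a local gradient on its shard ...
    grads = [2 * X.T @ (X @ w - y) / len(X) for X, y in shards]
    # ... then all of them block on a synchronous all-reduce before anyone
    # can move on. On a real system this exchanges the full gradient vector
    # every step, hence the bandwidth and latency requirements.
    g = np.mean(grads, axis=0)
    w -= 0.01 * g
```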
5 code implementations • NeurIPS 2019 • Ligeng Zhu, Zhijian Liu, Song Han
Exchanging gradients is a widely used method in modern multi-node machine learning systems (e.g., distributed training, collaborative learning).
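A hedged PyTorch sketch of the gradient-leakage attack this work demonstrates: an attacker who observes the gradients of a shared model optimizes dummy inputs and labels so that their gradients match the observed ones, thereby recovering the private training data. The toy linear model and L-BFGS settings are simplifications of the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Linear(20, 5)                 # toy shared model

# Victim's private batch and the gradient it shares with other nodes.
x_true = torch.randn(1, 20)
y_true = torch.tensor([3])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

# Attacker: optimize dummy data and a soft dummy label to reproduce the gradients.
x_dummy = torch.randn(1, 20, requires_grad=True)
y_dummy = torch.randn(1, 5, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy], lr=0.1)

def closure():
    opt.zero_grad()
    dummy_loss = torch.sum(-F.softmax(y_dummy, dim=-1)
                           * F.log_softmax(model(x_dummy), dim=-1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                      create_graph=True)
    # Match the observed gradients; backprop through the gradient computation.
    grad_diff = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(30):
    opt.step(closure)

print("reconstruction error:", (x_dummy - x_true).norm().item())
```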
no code implementations • 24 Apr 2019 • Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin
Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
18 code implementations • ICLR 2019 • Han Cai, Ligeng Zhu, Song Han
We address the high memory consumption issue of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level of regular training while still allowing a large candidate set.
Ranked #6 on Neural Architecture Search on CIFAR-10 Image Classification (using extra training data)
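A hedged PyTorch sketch of the memory-reduction idea the ProxylessNAS snippet refers to: instead of executing every candidate operation in a mixed layer (as fully differentiable NAS does), a single path is sampled per step according to the architecture parameters, so only one candidate's activations live in memory at a time. The candidate set is illustrative, and how the architecture parameters themselves receive gradients (the paper's binarized-gate estimator) is omitted.

```python
import torch
import torch.nn as nn

class SampledMixedOp(nn.Module):
    """A layer with several candidate ops; only one is run per forward pass."""

    def __init__(self, channels):
        super().__init__()
        self.candidates = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        # Architecture parameters: one logit per candidate op.
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x):
        probs = torch.softmax(self.alpha, dim=0)
        idx = torch.multinomial(probs, 1).item()  # sample one path
        # Only the sampled candidate runs, so only its activations are stored,
        # unlike the weighted sum over all candidates used by e.g. DARTS.
        return self.candidates[idx](x)

layer = SampledMixedOp(16)
out = layer(torch.randn(2, 16, 32, 32))
print(out.shape)
```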
2 code implementations • ECCV 2018 • Ligeng Zhu, Ruizhi Deng, Michael Maire, Zhiwei Deng, Greg Mori, Ping Tan
We explore a key architectural aspect of deep convolutional neural networks: the pattern of internal skip connections used to aggregate outputs of earlier layers for consumption by deeper layers.
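A hedged PyTorch sketch of the kind of aggregation pattern this work studies: each block consumes a concatenation of the outputs of a chosen subset of earlier blocks. The power-of-two offsets used below are one sparse pattern meant to illustrate the design dimension, not to reproduce the paper's exact architecture.

```python
import torch
import torch.nn as nn

def predecessor_ids(i):
    """Earlier layers aggregated by layer i: offsets 1, 2, 4, 8, ... (a sparse pattern)."""
    ids, offset = [], 1
    while offset <= i:
        ids.append(i - offset)
        offset *= 2
    return ids

class AggregationNet(nn.Module):
    def __init__(self, channels=16, depth=8):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.ModuleList()
        for i in range(1, depth):
            in_ch = channels * len(predecessor_ids(i))  # concat of selected predecessors
            self.blocks.append(nn.Sequential(
                nn.Conv2d(in_ch, channels, 3, padding=1), nn.ReLU()))

    def forward(self, x):
        outputs = [self.stem(x)]
        for i, block in enumerate(self.blocks, start=1):
            skip = torch.cat([outputs[j] for j in predecessor_ids(i)], dim=1)
            outputs.append(block(skip))
        return outputs[-1]

net = AggregationNet()
print(net(torch.randn(1, 3, 32, 32)).shape)
```

Swapping `predecessor_ids` for "all earlier layers" or "only the previous layer" recovers dense and plain feed-forward aggregation, which is exactly the axis of variation the snippet describes.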
no code implementations • 5 Dec 2017 • Mengyao Zhai, Jiacheng Chen, Ruizhi Deng, Lei Chen, Ligeng Zhu, Greg Mori
We propose an architecture that combines a hierarchical temporal model for predicting human poses with encoder-decoder convolutional neural networks for rendering target appearances.
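A hedged PyTorch sketch of the two-stage pipeline described above: a temporal model predicts a future pose from a pose history, and an encoder-decoder CNN renders an appearance conditioned on a rasterized pose map. The hierarchy over body parts and all training details are omitted, and the module sizes are illustrative.

```python
import torch
import torch.nn as nn

class PosePredictor(nn.Module):
    """Temporal model: predicts the next pose (K keypoints) from a pose history."""
    def __init__(self, num_keypoints=17, hidden=128):
        super().__init__()
        self.gru = nn.GRU(num_keypoints * 2, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_keypoints * 2)

    def forward(self, pose_seq):                 # (B, T, K*2)
        _, h = self.gru(pose_seq)
        return self.head(h[-1])                  # (B, K*2) predicted next pose

class Renderer(nn.Module):
    """Encoder-decoder CNN: renders an image from pose heatmaps."""
    def __init__(self, pose_channels=17):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(pose_channels, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, pose_map):                 # (B, K, H, W) heatmaps
        return self.decoder(self.encoder(pose_map))

poses = torch.randn(2, 10, 17 * 2)               # a 10-frame pose history
next_pose = PosePredictor()(poses)               # stage 1: predict future motion
frame = Renderer()(torch.randn(2, 17, 64, 64))   # stage 2: render target appearance
print(next_pose.shape, frame.shape)
```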