Search Results for author: Ligeng Zhu

Found 13 papers, 7 papers with code

DataMix: Efficient Privacy-Preserving Edge-Cloud Inference

no code implementations ECCV 2020 Zhijian Liu, Zhanghao Wu, Chuang Gan, Ligeng Zhu, Song Han

Third, our solution is efficient on the edge, since the majority of the workload is delegated to the cloud, and our mixing and de-mixing processes introduce very little extra computation.

Privacy Preserving, Speech Recognition +1
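The mixing/de-mixing idea relies on the model commuting with linear combinations of inputs. A minimal numpy sketch under a simplifying assumption of my own: the cloud model is purely linear, f(x) = A @ x, so de-mixing is exact (the actual DataMix method handles nonlinear networks and is more involved):

```python
import numpy as np

# Privacy idea (sketched): the edge never uploads raw inputs x1, x2; it
# uploads random linear mixtures. Because a linear model commutes with
# mixing, the edge can de-mix the cloud's outputs locally.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 8))            # cloud model (linear, for the sketch)
x1, x2 = rng.standard_normal(8), rng.standard_normal(8)  # private inputs

M = rng.standard_normal((2, 2))            # random invertible mixing matrix
mixed = M @ np.stack([x1, x2])             # edge: mix before uploading

cloud_out = mixed @ A.T                    # cloud: run the model on mixtures
demixed = np.linalg.inv(M) @ cloud_out     # edge: de-mix returned outputs

print(np.allclose(demixed[0], A @ x1) and np.allclose(demixed[1], A @ x2))  # True
```

The cloud only ever sees `mixed`, from which neither `x1` nor `x2` is recoverable without `M`, which stays on the edge.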

On-Device Training Under 256KB Memory

1 code implementation 30 Jun 2022 Ji Lin, Ligeng Zhu, Wei-Ming Chen, Wei-Chen Wang, Chuang Gan, Song Han

To reduce the memory footprint, we propose Sparse Update to skip the gradient computation of less important layers and sub-tensors.

Quantization, Transfer Learning
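The sparse-update idea can be sketched as a backward pass that computes gradients only for parameters flagged as important. A minimal numpy illustration (the two-layer toy model and the importance mask are my own, not the paper's layer-selection scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((8, 4)) * 0.1   # early layer: frozen
W2 = rng.standard_normal((1, 8)) * 0.1   # late layer: selected for update
update_mask = {"W1": False, "W2": True}  # which layers get gradients

x, y, lr = rng.standard_normal(4), 1.0, 0.1
losses = []
for _ in range(100):
    h = W1 @ x                           # forward (linear hidden, for simplicity)
    e = (W2 @ h)[0] - y                  # error for loss 0.5 * e**2
    losses.append(0.5 * e * e)
    if update_mask["W2"]:
        W2 -= lr * e * h[None, :]        # gradient only for the selected layer
    # W1's gradient is never computed: its backward work, and the memory
    # needed to stash activations for it, is skipped entirely.

print(losses[-1] < losses[0])  # loss still decreases despite the frozen layer
```

Skipping the frozen layer's backward pass is where the memory saving comes from: activations needed only for its gradient never have to be stored.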

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

no code implementations 25 Apr 2022 Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition.

Model Compression, Neural Architecture Search +3

Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning

no code implementations NeurIPS 2021 Ligeng Zhu, Hongzhou Lin, Yao Lu, Yujun Lin, Song Han

Federated Learning is an emerging direction in distributed machine learning that enables jointly training a model without sharing the data.

Federated Learning
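The delayed-averaging trick, tolerating communication latency by letting workers run ahead on local gradients and reconciling with the delayed average once it arrives, can be simulated on a toy problem. This is a hedged sketch under my own assumptions (two workers, one quadratic objective each, a fixed delay D), not the paper's exact procedure:

```python
import numpy as np
from collections import deque

# Each worker applies its local gradient immediately; when the globally
# averaged gradient from D steps ago finally "arrives", the worker swaps
# out its own stale local term for that average.
c = np.array([0.0, 4.0])            # worker i minimizes 0.5 * (w - c[i])**2
w = np.array([10.0, 10.0])          # per-worker model replicas
lr, D = 0.1, 5
queues = [deque() for _ in c]       # local gradients awaiting correction

for t in range(200):
    g = w - c                       # local gradients
    w = w - lr * g                  # apply local gradient without waiting
    g_avg = g.mean()                # computed now, "arrives" D steps later
    for i in range(len(w)):
        queues[i].append((g[i], g_avg))
        if len(queues[i]) > D:      # the delayed average has arrived
            g_loc, g_avg_old = queues[i].popleft()
            w[i] += lr * (g_loc - g_avg_old)   # replace stale local term

print(round(w.mean(), 6))  # replica average reaches the optimum mean(c): 2.0
```

The point of the simulation: progress never stalls waiting on the network, yet the averaged corrections keep the replicas' mean at the global optimum.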

IOS: Inter-Operator Scheduler for CNN Acceleration

1 code implementation 2 Nov 2020 Yaoyao Ding, Ligeng Zhu, Zhihao Jia, Gennady Pekhimenko, Song Han

To accelerate CNN inference, existing deep learning frameworks focus on optimizing intra-operator parallelization.
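Inter-operator parallelization means scheduling independent operators concurrently rather than one after another. A toy latency model of the difference (the tiny DAG and durations are invented for illustration; the real IOS searches over schedules with dynamic programming):

```python
# op -> (duration_ms, dependencies)
ops = {
    "conv_a": (3.0, []),
    "conv_b": (2.0, []),          # independent of conv_a
    "concat": (1.0, ["conv_a", "conv_b"]),
}

def sequential_latency(ops):
    """Intra-op parallelism only: operators run one after another."""
    return sum(dur for dur, _ in ops.values())

def concurrent_latency(ops, op=None):
    """Inter-op parallelism: independent operators overlap, so total
    latency is the critical path through the dependency DAG."""
    if op is None:
        return max(concurrent_latency(ops, o) for o in ops)
    dur, deps = ops[op]
    return dur + max((concurrent_latency(ops, d) for d in deps), default=0.0)

print(sequential_latency(ops), concurrent_latency(ops))  # 6.0 4.0
```

Here overlapping the two independent convolutions cuts latency from 6 ms to 4 ms; IOS's contribution is finding such groupings automatically on real operator graphs.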

TinyTL: Reduce Activations, Not Trainable Parameters for Efficient On-Device Learning

1 code implementation NeurIPS 2020 Han Cai, Chuang Gan, Ligeng Zhu, Song Han

Furthermore, combined with feature extractor adaptation, TinyTL provides 7.3-12.9x memory saving without sacrificing accuracy compared to fine-tuning the full Inception-V3.

Transfer Learning
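The memory saving comes from not storing activations for frozen weights: for a linear layer, the bias gradient needs only the upstream error, not the layer input. A minimal sketch with toy shapes (this omits TinyTL's lite residual modules):

```python
import numpy as np

# For y = W @ x + b with loss L = 0.5 * ||y - t||**2:
#   dL/db = e = y - t        -> needs no stored activation x
#   dL/dW = np.outer(e, x)   -> needs x kept around for the backward pass
# Freezing W and learning only b is what lets activations be discarded
# right after the forward pass.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))
b = np.zeros(3)
x = rng.standard_normal(5)
t = rng.standard_normal(3)

for _ in range(500):
    e = W @ x + b - t     # upstream error
    b -= 0.1 * e          # bias-only update: x is never referenced here

print(np.allclose(W @ x + b, t))  # the bias alone fits this toy target: True
```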

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

4 code implementations ACL 2020 Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han

To enable low-latency inference on resource-constrained hardware platforms, we propose to design Hardware-Aware Transformers (HAT) with neural architecture search.

Machine Translation, Neural Architecture Search +1
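Hardware-aware search can be sketched as choosing the most accurate candidate whose *predicted* latency on the target device fits a budget. The candidate configs, accuracy proxies, and latency table below are all invented for illustration (HAT trains a SuperTransformer and a learned latency predictor; this is just the selection step):

```python
# candidate -> (accuracy proxy, predicted latency in ms on the target device)
candidates = {
    "tiny":   (0.70, 12.0),
    "small":  (0.74, 21.0),
    "medium": (0.77, 38.0),
    "large":  (0.79, 95.0),
}

def search(candidates, latency_budget_ms):
    """Return the most accurate candidate under the latency budget."""
    feasible = {k: v for k, v in candidates.items() if v[1] <= latency_budget_ms}
    return max(feasible, key=lambda k: feasible[k][0])

print(search(candidates, 40.0))   # medium
print(search(candidates, 15.0))   # tiny
```

Because the latency column differs per device, the same search yields different architectures for different hardware, which is the point of the method.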

Distributed Training Across the World

no code implementations 25 Sep 2019 Ligeng Zhu, Yao Lu, Yujun Lin, Song Han

Traditional synchronous distributed training is performed inside a cluster, since it requires a high-bandwidth, low-latency network (e.g., 25Gb Ethernet or InfiniBand).

Deep Leakage from Gradients

5 code implementations NeurIPS 2019 Ligeng Zhu, Zhijian Liu, Song Han

Exchanging gradients is a widely used method in modern multi-node machine learning systems (e.g., distributed training, collaborative learning).
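The title refers to reconstructing private training data from those shared gradients. As a minimal closed-form illustration of why gradients leak inputs, consider a single linear layer with bias under squared-error loss, where the input is recoverable exactly (the paper's general attack instead optimizes dummy data to match the observed gradients):

```python
import numpy as np

# Toy model: y_hat = w @ x + b, loss L = 0.5 * (y_hat - y)**2.
# The gradients an honest client would share are:
#   dL/db = e         (the residual, a scalar)
#   dL/dw = e * x     (residual times the private input)
# so an eavesdropper recovers x exactly as (dL/dw) / (dL/db), when e != 0.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)        # private input
y = 1.5                           # private label
w = rng.standard_normal(4)
b = 0.1

e = w @ x + b - y                 # residual
grad_w = e * x                    # shared gradient w.r.t. weights
grad_b = e                        # shared gradient w.r.t. bias

x_recovered = grad_w / grad_b     # attacker's reconstruction
print(np.allclose(x_recovered, x))  # True
```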

Design Automation for Efficient Deep Learning Computing

no code implementations 24 Apr 2019 Song Han, Han Cai, Ligeng Zhu, Ji Lin, Kuan Wang, Zhijian Liu, Yujun Lin

Moreover, we shorten the design cycle by 200x compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.


ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

18 code implementations ICLR 2019 Han Cai, Ligeng Zhu, Song Han

We address the high memory consumption issue of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level as regular training while still allowing a large candidate set.

Image Classification, Neural Architecture Search
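The memory reduction comes from sampling a single candidate path per update instead of materializing every candidate's output. A numpy sketch of that sampling step (the candidate ops and shapes are illustrative; the paper's gradient estimator for the architecture parameters is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
candidates = [lambda v: np.maximum(v, 0.0),   # candidate op: relu
              lambda v: np.tanh(v),           # candidate op: tanh
              lambda v: v]                    # candidate op: identity
alpha = np.array([0.3, 1.0, -0.5])            # architecture parameters

# Differentiable-NAS baseline: evaluate *all* candidates and mix them,
# so every candidate's output must sit in memory at once.
probs = np.exp(alpha) / np.exp(alpha).sum()
mixed = sum(p * op(x) for p, op in zip(probs, candidates))

# ProxylessNAS-style step: sample one path; only that output is stored.
k = rng.choice(len(candidates), p=probs)
sampled = candidates[k](x)

print(any(np.allclose(sampled, op(x)) for op in candidates))  # True
```

With N candidate ops per layer, the mixed path stores N activations while the sampled path stores one, which is where the "same level as regular training" memory claim comes from.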

Sparsely Aggregated Convolutional Networks

2 code implementations ECCV 2018 Ligeng Zhu, Ruizhi Deng, Michael Maire, Zhiwei Deng, Greg Mori, Ping Tan

We explore a key architectural aspect of deep convolutional neural networks: the pattern of internal skip connections used to aggregate outputs of earlier layers for consumption by deeper layers.
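Dense aggregation (DenseNet-style) links every layer to all of its predecessors, which is quadratic in depth; a sparse pattern that aggregates only predecessors at power-of-two offsets stays near N·log N. A small link count, with offsets reflecting my reading of the sparse pattern (the paper's exact connectivity may differ):

```python
def dense_links(n_layers):
    # layer i aggregates the outputs of all i previous layers
    return sum(i for i in range(n_layers))

def sparse_links(n_layers):
    # layer i aggregates layers i-1, i-2, i-4, ... (power-of-two offsets)
    total = 0
    for i in range(n_layers):
        k = 1
        while i - k >= 0:
            total += 1
            k *= 2
    return total

print(dense_links(64), sparse_links(64))  # quadratic vs. roughly n * log2(n)
```

For 64 layers the sparse pattern needs far fewer aggregation links, yet every layer still has a short path to all earlier layers through the chained offsets.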

Learning to Forecast Videos of Human Activity with Multi-granularity Models and Adaptive Rendering

no code implementations 5 Dec 2017 Mengyao Zhai, Jiacheng Chen, Ruizhi Deng, Lei Chen, Ligeng Zhu, Greg Mori

We propose an architecture that combines a hierarchical temporal model for predicting human poses with encoder-decoder convolutional neural networks for rendering target appearances.
