Search Results for author: Animesh Jain

Found 7 papers, 3 papers with code

Automatic Attention Pruning: Improving and Automating Model Pruning using Attentions

1 code implementation • 14 Mar 2023 • Kaiqi Zhao, Animesh Jain, Ming Zhao

Then, it proposes adaptive pruning policies for automatically meeting the pruning objectives of accuracy-critical, memory-constrained, and latency-sensitive tasks.
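
As a rough illustration of what such an adaptive policy might look like, the sketch below keeps raising a pruning ratio until a memory budget is met. The layer shapes, the 5% step, and the magnitude criterion are illustrative assumptions; in particular, the magnitude score stands in for the paper's attention-based importance measure.

```python
import numpy as np

# Hypothetical sketch of an adaptive pruning policy: raise the pruning ratio
# until a memory budget is satisfied. All names and numbers are illustrative.

rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 128)), rng.standard_normal((128, 256))]

def prune_by_magnitude(weights, ratio):
    """Zero out the smallest-magnitude fraction `ratio` of each weight matrix."""
    pruned = []
    for w in weights:
        k = int(ratio * w.size)
        if k == 0:
            pruned.append(w.copy())
            continue
        thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        pruned.append(np.where(np.abs(w) <= thresh, 0.0, w))
    return pruned

def nonzero_params(weights):
    return sum(int(np.count_nonzero(w)) for w in weights)

memory_budget = 0.5 * nonzero_params(layers)   # e.g. keep at most 50% of the weights
ratio = 0.0
while nonzero_params(prune_by_magnitude(layers, ratio)) > memory_budget:
    ratio += 0.05                              # adapt the ratio until the budget holds

print(f"selected pruning ratio: {ratio:.2f}")
```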

Iterative Activation-based Structured Pruning

no code implementations • 22 Jan 2022 • Kaiqi Zhao, Animesh Jain, Ming Zhao

To solve this problem, we propose two activation-based pruning methods, Iterative Activation-based Pruning (IAP) and Adaptive Iterative Activation-based Pruning (AIAP).
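
The snippet below is a minimal, hypothetical sketch of the general idea behind activation-based pruning: rank channels by their average activation magnitude on a calibration batch and drop the weakest ones. The specific IAP and AIAP criteria and iteration schedules are defined in the paper and are not reproduced here.

```python
import numpy as np

# Hypothetical activation-based channel scoring on a fake calibration batch.
rng = np.random.default_rng(0)
activations = np.abs(rng.standard_normal((32, 64, 14, 14)))  # (batch, channels, H, W)

# Mean activation magnitude per channel, averaged over batch and spatial dims.
channel_scores = activations.mean(axis=(0, 2, 3))

prune_fraction = 0.25
n_prune = int(prune_fraction * channel_scores.size)
pruned_channels = np.argsort(channel_scores)[:n_prune]   # weakest channels first

print(f"pruning {n_prune} of {channel_scores.size} channels:",
      sorted(pruned_channels.tolist()))
```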

Adaptive Activation-based Structured Pruning

1 code implementation • 21 Jan 2022 • Kaiqi Zhao, Animesh Jain, Ming Zhao

Pruning is a promising approach to compress complex deep learning models in order to deploy them on resource-constrained edge devices.

Automated Backend-Aware Post-Training Quantization

no code implementations • 27 Mar 2021 • Ziheng Jiang, Animesh Jain, Andrew Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, Luis Ceze

Quantization is a key technique to reduce the resource requirement and improve the performance of neural network deployment.

Quantization
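
For context, the sketch below shows plain asymmetric (affine) int8 post-training quantization of a single tensor, q = round(x / scale) + zero_point. It illustrates the standard scheme only and does not cover the backend-aware automation that is the subject of the paper.

```python
import numpy as np

# Minimal affine int8 quantization of one tensor (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)

qmin, qmax = -128, 127                        # int8 range
scale = (x.max() - x.min()) / (qmax - qmin)
zero_point = int(round(qmin - x.min() / scale))

q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
x_hat = (q.astype(np.float32) - zero_point) * scale   # dequantize

print("max reconstruction error:", np.abs(x - x_hat).max())
```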

UNIT: Unifying Tensorized Instruction Compilation

no code implementations • 21 Jan 2021 • Jian Weng, Animesh Jain, Jie Wang, Leyuan Wang, Yida Wang, Tony Nowatzki

However, it is hard to leverage mixed precision without hardware support because of the overhead of data casting.
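
The sketch below emulates, in NumPy, what a mixed-precision tensorized instruction such as CUDA's dp4a or x86 AVX-512 VNNI computes in a single step: a short int8-by-int8 dot product accumulated into int32. Without such hardware support, each low-precision operand has to be widened explicitly, which is the data-casting overhead mentioned above.

```python
import numpy as np

# Emulation of one int8 dot-product-accumulate step (dp4a/VNNI-style),
# written in NumPy purely for illustration.
a = np.array([12, -7, 100,  3], dtype=np.int8)
b = np.array([-5, 44,   2, 90], dtype=np.int8)
acc = np.int32(1000)                      # running int32 accumulator

# One "tensorized" step: widen, multiply, reduce, accumulate.
acc = acc + np.int32(np.sum(a.astype(np.int32) * b.astype(np.int32)))

print("accumulator:", acc)                # 1000 + (-60 - 308 + 200 + 270) = 1102
```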

Efficient Execution of Quantized Deep Learning Models: A Compiler Approach

no code implementations • 18 Jun 2020 • Animesh Jain, Shoubhik Bhattacharya, Masahiro Masuda, Vin Sharma, Yida Wang

A deep learning compiler such as Apache TVM can enable the efficient execution of models from various frameworks on various targets.

Quantization
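
As a minimal sketch of the compilation flow referred to above, assuming TVM's Relay ONNX frontend (exact API names vary across TVM releases): the file name, input name, and input shape below are placeholders.

```python
import onnx
import tvm
from tvm import relay

# Load a model exported from any ONNX-capable framework (placeholder path and shape).
onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 224, 224)})

# Compile for a target; "llvm" targets the host CPU, other target strings select GPUs etc.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Export the compiled artifact for deployment.
lib.export_library("compiled_model.so")
```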

Optimizing Memory-Access Patterns for Deep Learning Accelerators

1 code implementation • 27 Feb 2020 • Hongbin Zheng, Sejong Oh, Huiqing Wang, Preston Briggs, Jiading Gai, Animesh Jain, Yizhi Liu, Rich Heaton, Randy Huang, Yida Wang

Deep learning (DL) workloads are moving towards accelerators for faster processing and lower cost.
