Search Results for author: Animesh Jain

Found 7 papers, 3 papers with code

Automatic Attention Pruning: Improving and Automating Model Pruning using Attentions

1 code implementation • 14 Mar 2023 • Kaiqi Zhao, Animesh Jain, Ming Zhao

Then, it proposes adaptive pruning policies for automatically meeting the pruning objectives of accuracy-critical, memory-constrained, and latency-sensitive tasks.
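
As a rough illustration of what such an adaptive policy might look like, the sketch below keeps raising a pruning ratio until a memory budget is met. The layer shapes, the 5% step, and the magnitude criterion are illustrative assumptions; in particular, the magnitude score stands in for the paper's attention-based importance measure.

```python
import numpy as np

# Hypothetical sketch of an adaptive pruning policy: raise the pruning ratio
# until a memory budget is satisfied. All names and numbers are illustrative.

rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 128)), rng.standard_normal((128, 256))]

def prune_by_magnitude(weights, ratio):
    """Zero out the smallest-magnitude fraction `ratio` of each weight matrix."""
    pruned = []
    for w in weights:
        k = int(ratio * w.size)
        if k == 0:
            pruned.append(w.copy())
            continue
        thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        pruned.append(np.where(np.abs(w) <= thresh, 0.0, w))
    return pruned

def nonzero_params(weights):
    return sum(int(np.count_nonzero(w)) for w in weights)

memory_budget = 0.5 * nonzero_params(layers)   # e.g. keep at most 50% of the weights
ratio = 0.0
while nonzero_params(prune_by_magnitude(layers, ratio)) > memory_budget:
    ratio += 0.05                              # adapt the ratio until the budget holds

print(f"selected pruning ratio: {ratio:.2f}")
```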

Iterative Activation-based Structured Pruning

no code implementations • 22 Jan 2022 • Kaiqi Zhao, Animesh Jain, Ming Zhao

To solve this problem, we propose two activation-based pruning methods, Iterative Activation-based Pruning (IAP) and Adaptive Iterative Activation-based Pruning (AIAP).
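
The snippet below is a minimal, hypothetical sketch of the general idea behind activation-based pruning: rank channels by their average activation magnitude on a calibration batch and drop the weakest ones. The specific IAP and AIAP criteria and iteration schedules are defined in the paper and are not reproduced here.

```python
import numpy as np

# Hypothetical activation-based channel scoring on a fake calibration batch.
rng = np.random.default_rng(0)
activations = np.abs(rng.standard_normal((32, 64, 14, 14)))  # (batch, channels, H, W)

# Mean activation magnitude per channel, averaged over batch and spatial dims.
channel_scores = activations.mean(axis=(0, 2, 3))

prune_fraction = 0.25
n_prune = int(prune_fraction * channel_scores.size)
pruned_channels = np.argsort(channel_scores)[:n_prune]   # weakest channels first

print(f"pruning {n_prune} of {channel_scores.size} channels:",
      sorted(pruned_channels.tolist()))
```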

Adaptive Activation-based Structured Pruning

1 code implementation • 21 Jan 2022 • Kaiqi Zhao, Animesh Jain, Ming Zhao

Pruning is a promising approach to compress complex deep learning models in order to deploy them on resource-constrained edge devices.

Automated Backend-Aware Post-Training Quantization

no code implementations • 27 Mar 2021 • Ziheng Jiang, Animesh Jain, Andrew Liu, Josh Fromm, Chengqian Ma, Tianqi Chen, Luis Ceze

Quantization is a key technique to reduce the resource requirement and improve the performance of neural network deployment.

Quantization
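
For context, the sketch below shows plain asymmetric (affine) int8 post-training quantization of a single tensor, q = round(x / scale) + zero_point. It illustrates the standard scheme only and does not cover the backend-aware automation that is the subject of the paper.

```python
import numpy as np

# Minimal affine int8 quantization of one tensor (illustrative only).
rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)

qmin, qmax = -128, 127                        # int8 range
scale = (x.max() - x.min()) / (qmax - qmin)
zero_point = int(round(qmin - x.min() / scale))

q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
x_hat = (q.astype(np.float32) - zero_point) * scale   # dequantize

print("max reconstruction error:", np.abs(x - x_hat).max())
```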

UNIT: Unifying Tensorized Instruction Compilation

no code implementations • 21 Jan 2021 • Jian Weng, Animesh Jain, Jie Wang, Leyuan Wang, Yida Wang, Tony Nowatzki

However, it is hard to leverage mixed precision without hardware support because of the overhead of data casting.
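
The sketch below emulates, in NumPy, what a mixed-precision tensorized instruction such as CUDA's dp4a or x86 AVX-512 VNNI computes in a single step: a short int8-by-int8 dot product accumulated into int32. Without such hardware support, each low-precision operand has to be widened explicitly, which is the data-casting overhead mentioned above.

```python
import numpy as np

# Emulation of one int8 dot-product-accumulate step (dp4a/VNNI-style),
# written in NumPy purely for illustration.
a = np.array([12, -7, 100,  3], dtype=np.int8)
b = np.array([-5, 44,   2, 90], dtype=np.int8)
acc = np.int32(1000)                      # running int32 accumulator

# One "tensorized" step: widen, multiply, reduce, accumulate.
acc = acc + np.int32(np.sum(a.astype(np.int32) * b.astype(np.int32)))

print("accumulator:", acc)                # 1000 + (-60 - 308 + 200 + 270) = 1102
```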

Efficient Execution of Quantized Deep Learning Models: A Compiler Approach

no code implementations • 18 Jun 2020 • Animesh Jain, Shoubhik Bhattacharya, Masahiro Masuda, Vin Sharma, Yida Wang

A deep learning compiler such as Apache TVM can enable the efficient execution of models from various frameworks on various targets.

Quantization
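
As a minimal sketch of the compilation flow referred to above, assuming TVM's Relay ONNX frontend (exact API names vary across TVM releases): the file name, input name, and input shape below are placeholders.

```python
import onnx
import tvm
from tvm import relay

# Load a model exported from any ONNX-capable framework (placeholder path and shape).
onnx_model = onnx.load("model.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, shape={"input": (1, 3, 224, 224)})

# Compile for a target; "llvm" targets the host CPU, other target strings select GPUs etc.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)

# Export the compiled artifact for deployment.
lib.export_library("compiled_model.so")
```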

Optimizing Memory-Access Patterns for Deep Learning Accelerators

1 code implementation • 27 Feb 2020 • Hongbin Zheng, Sejong Oh, Huiqing Wang, Preston Briggs, Jiading Gai, Animesh Jain, Yizhi Liu, Rich Heaton, Randy Huang, Yida Wang

Deep learning (DL) workloads are moving towards accelerators for faster processing and lower cost.
