Search Results for author: Hongxu Yin

Found 23 papers, 6 papers with code

Structural Pruning via Latency-Saliency Knapsack

1 code implementation 13 Oct 2022 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP), which formulates structural pruning as a global resource allocation optimization problem, aiming to maximize accuracy while keeping latency under a predefined budget on the target device.
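
To make the knapsack framing above concrete, here is a minimal illustrative sketch: keep the most saliency-dense channel groups whose summed latency cost fits a budget. The saliency and latency numbers and the greedy solver are assumptions for illustration, not HALP's actual algorithm or latency model.

```python
# Illustrative only: a greedy knapsack over prunable channel groups.
# Saliency/latency values are made up; HALP itself uses a global solver
# and hardware latency lookup tables rather than this heuristic.

def select_groups(groups, latency_budget):
    """groups: list of (name, saliency, latency_cost). Keep the most
    salient groups whose total latency fits within the budget."""
    # Rank by saliency per unit latency (classic knapsack heuristic).
    ranked = sorted(groups, key=lambda g: g[1] / g[2], reverse=True)
    kept, total_latency = [], 0.0
    for name, saliency, latency in ranked:
        if total_latency + latency <= latency_budget:
            kept.append(name)
            total_latency += latency
    return kept, total_latency

# Toy example: channel groups from three layers and a 6 ms budget.
groups = [("conv1.g0", 0.9, 2.0), ("conv1.g1", 0.2, 2.0),
          ("conv2.g0", 0.7, 3.0), ("conv3.g0", 0.4, 1.5)]
print(select_groups(groups, latency_budget=6.0))
```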

Global Context Vision Transformers

7 code implementations 20 Jun 2022 Ali Hatamizadeh, Hongxu Yin, Jan Kautz, Pavlo Molchanov

We propose the Global Context Vision Transformer (GC ViT), a novel architecture that enhances parameter and compute utilization for computer vision tasks.

Image Classification · Inductive Bias +3

GradViT: Gradient Inversion of Vision Transformers

no code implementations CVPR 2022 Ali Hatamizadeh, Hongxu Yin, Holger Roth, Wenqi Li, Jan Kautz, Daguang Xu, Pavlo Molchanov

In this work we demonstrate the vulnerability of vision transformers (ViTs) to gradient-based inversion attacks.

Scheduling

AdaViT: Adaptive Tokens for Efficient Vision Transformer

no code implementations CVPR 2022 Hongxu Yin, Arash Vahdat, Jose Alvarez, Arun Mallya, Jan Kautz, Pavlo Molchanov

A-ViT achieves this by automatically reducing the number of tokens in vision transformers that are processed in the network as inference proceeds.

Image Classification
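
The following sketch illustrates the token-reduction idea in the A-ViT entry above: accumulate a per-token halting score across transformer blocks and stop processing a token once its score crosses a threshold. The scoring function and threshold here are placeholders; the actual A-ViT halting mechanism is learned end to end.

```python
import numpy as np

# Illustrative sketch of adaptive token halting (not the paper's exact rule):
# each block emits a halting score per token; once a token's cumulative
# score exceeds a threshold it is masked out of later blocks.

def run_with_halting(tokens, num_blocks, threshold=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    n = tokens.shape[0]
    cumulative = np.zeros(n)
    active = np.ones(n, dtype=bool)
    active_counts = []
    for _ in range(num_blocks):
        # Placeholder halting scores; A-ViT predicts these from the tokens.
        scores = rng.uniform(0.0, 0.5, size=n)
        cumulative[active] += scores[active]
        active &= cumulative < threshold
        active_counts.append(int(active.sum()))
        # A real model would only run attention/MLP on the active tokens here.
    return active_counts

print(run_with_halting(np.zeros((196, 768)), num_blocks=12))
```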

When to Prune? A Policy towards Early Structural Pruning

no code implementations CVPR 2022 Maying Shen, Pavlo Molchanov, Hongxu Yin, Jose M. Alvarez

Through extensive experiments on ImageNet, we show that EPI empowers a quick tracking of early training epochs suitable for pruning, offering the same efficacy as an otherwise "oracle" grid search that scans through epochs and requires orders of magnitude more compute.

Network Pruning

HALP: Hardware-Aware Latency Pruning

1 code implementation 20 Oct 2021 Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jianna Liu, Jose M. Alvarez

We propose Hardware-Aware Latency Pruning (HALP), which formulates structural pruning as a global resource allocation optimization problem, aiming to maximize accuracy while keeping latency under a predefined budget.

NViT: Vision Transformer Compression and Parameter Redistribution

no code implementations 10 Oct 2021 Huanrui Yang, Hongxu Yin, Pavlo Molchanov, Hai Li, Jan Kautz

On ImageNet-1K, we prune the DeiT-Base (Touvron et al., 2021) model to a 2.6x FLOPs reduction, 5.1x parameter reduction, and 1.9x run-time speedup with only 0.07% loss in accuracy.

Hardware-Aware Network Transformation

no code implementations 29 Sep 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

In the second phase, it solves the combinatorial selection of efficient operations using a novel constrained integer linear optimization approach.

Neural Architecture Search
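
As a toy picture of the combinatorial selection mentioned in the entry above, the sketch below picks one candidate operation per layer to maximize a predicted accuracy gain under a latency budget. The candidate list, predicted gains, and latencies are invented, and the brute-force search stands in for the paper's constrained integer linear optimization.

```python
from itertools import product

# Toy version of choosing one candidate operation per layer under a latency
# budget. The paper formulates this as a constrained integer linear program;
# here we brute-force a tiny search space purely for illustration.

# candidates[layer] = list of (op_name, predicted_accuracy_gain, latency_ms)
candidates = [
    [("identity", 0.0, 1.0), ("conv3x3", 0.8, 3.0), ("conv5x5", 1.0, 5.0)],
    [("identity", 0.0, 1.0), ("mbconv", 0.6, 2.5)],
    [("identity", 0.0, 1.0), ("attention", 0.9, 4.0)],
]
budget_ms = 8.0

best = None
for choice in product(*candidates):
    latency = sum(op[2] for op in choice)
    gain = sum(op[1] for op in choice)
    if latency <= budget_ms and (best is None or gain > best[0]):
        best = (gain, latency, [op[0] for op in choice])

print(best)  # (total gain, total latency, chosen ops) within the budget
```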

Privacy Vulnerability of Split Computing to Data-Free Model Inversion Attacks

no code implementations 13 Jul 2021 Xin Dong, Hongxu Yin, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov, H. T. Kung

Prior works usually assume that SC offers privacy benefits as only intermediate features, instead of private data, are shared from devices to the cloud.

LANA: Latency Aware Network Acceleration

no code implementations 12 Jul 2021 Pavlo Molchanov, Jimmy Hall, Hongxu Yin, Jan Kautz, Nicolo Fusi, Arash Vahdat

We analyze three popular network architectures: EfficientNetV1, EfficientNetV2 and ResNeSt, and achieve accuracy improvement for all models (up to $3.0\%$) when compressing larger models to the latency level of smaller models.

Neural Architecture Search · Quantization

Optimal Quantization Using Scaled Codebook

no code implementations CVPR 2021 Yerlan Idelbayev, Pavlo Molchanov, Maying Shen, Hongxu Yin, Miguel A. Carreira-Perpinan, Jose M. Alvarez

We study the problem of quantizing N sorted, scalar datapoints with a fixed codebook containing K entries that are allowed to be rescaled.

Quantization
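
A simple way to picture the problem in the entry above is alternating optimization: assign each datapoint to its nearest scaled codeword, then refit the scale in closed form (least squares), and repeat. This heuristic is only an illustration under that framing, not the paper's optimal solver.

```python
import numpy as np

# Alternating sketch for quantizing scalars with a rescalable fixed codebook:
# (1) assign each point to the nearest scaled codeword, (2) refit the scale
# in closed form. This is a simple heuristic, not the paper's algorithm.

def quantize_with_scaled_codebook(x, codebook, iters=20):
    x = np.asarray(x, dtype=float)
    c = np.asarray(codebook, dtype=float)
    alpha = 1.0
    for _ in range(iters):
        # Step 1: nearest scaled codeword for every datapoint.
        assign = np.argmin(np.abs(x[:, None] - alpha * c[None, :]), axis=1)
        # Step 2: least-squares optimal scale for these assignments.
        cc = c[assign]
        denom = np.dot(cc, cc)
        if denom == 0:
            break
        alpha = np.dot(x, cc) / denom
    return alpha, alpha * c[assign]

x = np.sort(np.random.default_rng(0).normal(size=16))
alpha, xq = quantize_with_scaled_codebook(x, codebook=[-2, -1, 0, 1, 2])
print(alpha, np.mean((x - xq) ** 2))
```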

See through Gradients: Image Batch Recovery via GradInversion

2 code implementations CVPR 2021 Hongxu Yin, Arun Mallya, Arash Vahdat, Jose M. Alvarez, Jan Kautz, Pavlo Molchanov

In this work, we introduce GradInversion, using which input images from a larger batch (8 - 48 images) can also be recovered for large networks such as ResNets (50 layers), on complex datasets such as ImageNet (1000 classes, 224x224 px).

Federated Learning · Inference Attack +1
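
For context on what gradient-based batch recovery looks like in code, below is a generic gradient-matching inversion loop in PyTorch: a dummy input is optimized so that its gradients match the observed ones. It is a simplified stand-in, not GradInversion's full objective, which additionally recovers labels and uses fidelity and group-consistency regularizers.

```python
import torch
import torch.nn as nn

# Simplified gradient-matching inversion loop (illustration only).
# Assumes the labels are known; GradInversion recovers them separately.

model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
criterion = nn.CrossEntropyLoss()

# Gradients observed from the "victim" batch (here simulated).
x_true = torch.randn(4, 3, 32, 32)
y_true = torch.randint(0, 10, (4,))
target_grads = torch.autograd.grad(criterion(model(x_true), y_true),
                                   model.parameters())

x_hat = torch.randn(4, 3, 32, 32, requires_grad=True)
optimizer = torch.optim.Adam([x_hat], lr=0.1)

for step in range(200):
    optimizer.zero_grad()
    grads = torch.autograd.grad(criterion(model(x_hat), y_true),
                                model.parameters(), create_graph=True)
    # Match the synthetic batch's gradients to the observed ones.
    loss = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))
    loss.backward()
    optimizer.step()
```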

MHDeep: Mental Health Disorder Detection System based on Body-Area and Deep Neural Networks

no code implementations 20 Feb 2021 Shayan Hassantabar, Joe Zhang, Hongxu Yin, Niraj K. Jha

At the patient level, MHDeep DNNs achieve an accuracy of 100%, 100%, and 90.0% for the three mental health disorders, respectively.

Synthetic Data Generation

Fully Dynamic Inference with Deep Neural Networks

no code implementations 29 Jul 2020 Wenhan Xia, Hongxu Yin, Xiaoliang Dai, Niraj K. Jha

Modern deep neural networks are powerful and widely applicable models that extract task-relevant information through multi-level abstraction.

Self-Driving Cars

Efficient Synthesis of Compact Deep Neural Networks

no code implementations 18 Apr 2020 Wenhan Xia, Hongxu Yin, Niraj K. Jha

These large, deep models are often unsuitable for real-world applications, due to their massive computational cost, high memory bandwidth, and long latency.

Autonomous Driving

DiabDeep: Pervasive Diabetes Diagnosis based on Wearable Medical Sensors and Efficient Neural Networks

no code implementations 11 Oct 2019 Hongxu Yin, Bilal Mukadam, Xiaoliang Dai, Niraj K. Jha

For server (edge) side inference, we achieve a 96.3% (95.3%) accuracy in classifying diabetics against healthy individuals, and a 95.7% (94.6%) accuracy in distinguishing among type-1 diabetic, type-2 diabetic, and healthy individuals.

Incremental Learning Using a Grow-and-Prune Paradigm with Efficient Neural Networks

no code implementations 27 May 2019 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications.

Incremental Learning

Hardware-Guided Symbiotic Training for Compact, Accurate, yet Execution-Efficient LSTM

no code implementations 30 Jan 2019 Hongxu Yin, Guoyang Chen, Yingmin Li, Shuai Che, Weifeng Zhang, Niraj K. Jha

In this work, we propose a hardware-guided symbiotic training methodology for compact, accurate, yet execution-efficient inference models.

Language Modelling · Neural Network Compression +2

ChamNet: Towards Efficient Network Design through Platform-Aware Model Adaptation

1 code implementation CVPR 2019 Xiaoliang Dai, Peizhao Zhang, Bichen Wu, Hongxu Yin, Fei Sun, Yanghan Wang, Marat Dukhan, Yunqing Hu, Yiming Wu, Yangqing Jia, Peter Vajda, Matt Uyttendaele, Niraj K. Jha

We formulate platform-aware NN architecture search in an optimization framework and propose a novel algorithm to search for optimal architectures aided by efficient accuracy and resource (latency and/or energy) predictors.

Neural Architecture Search
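
The predictor-aided search described in the ChamNet entry above can be pictured with the sketch below: score candidate configurations with cheap accuracy and latency predictors and keep the best one that satisfies the resource constraint. Both predictor functions and the candidate space here are made-up stand-ins; ChamNet trains real accuracy and resource predictors instead.

```python
# Illustration of predictor-aided architecture adaptation. The predictors
# below are hypothetical formulas for demonstration only.

def predicted_accuracy(cfg):
    # Assumption: wider and deeper helps, with diminishing returns.
    return 70.0 + 5.0 * (cfg["width"] ** 0.5) + 2.0 * (cfg["depth"] ** 0.5)

def predicted_latency_ms(cfg):
    # Assumption: latency grows with width * depth.
    return 3.0 * cfg["width"] * cfg["depth"]

candidates = [{"width": w, "depth": d} for w in (0.5, 0.75, 1.0, 1.25)
              for d in (1, 2, 3)]
budget_ms = 10.0

feasible = [c for c in candidates if predicted_latency_ms(c) <= budget_ms]
best = max(feasible, key=predicted_accuracy)
print(best, predicted_accuracy(best), predicted_latency_ms(best))
```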

Grow and Prune Compact, Fast, and Accurate LSTMs

no code implementations 30 May 2018 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to the LSTM's original one-level non-linear control gates.

Image Captioning · speech-recognition +1
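
To make the "hidden layers inside the gates" idea from the entry above concrete, here is a minimal sketch of an LSTM cell whose gates are computed by small two-layer networks rather than a single affine transform. The layer sizes and activations are arbitrary illustrative choices, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Sketch of the H-LSTM idea: each control gate is an MLP with an extra
# hidden layer instead of a single affine transform.

class HLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        def gate():
            return nn.Sequential(
                nn.Linear(input_size + hidden_size, hidden_size),
                nn.ReLU(),                      # hidden layer inside the gate
                nn.Linear(hidden_size, hidden_size),
            )
        self.f_gate, self.i_gate, self.o_gate, self.g_gate = (
            gate(), gate(), gate(), gate())

    def forward(self, x, state):
        h, c = state
        z = torch.cat([x, h], dim=-1)
        f = torch.sigmoid(self.f_gate(z))       # forget gate
        i = torch.sigmoid(self.i_gate(z))       # input gate
        o = torch.sigmoid(self.o_gate(z))       # output gate
        g = torch.tanh(self.g_gate(z))          # candidate cell state
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

cell = HLSTMCell(16, 32)
h, c = cell(torch.randn(8, 16), (torch.zeros(8, 32), torch.zeros(8, 32)))
print(h.shape, c.shape)
```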

NeST: A Neural Network Synthesis Tool Based on a Grow-and-Prune Paradigm

no code implementations 6 Nov 2017 Xiaoliang Dai, Hongxu Yin, Niraj K. Jha

To address these problems, we introduce a network growth algorithm that complements network pruning to learn both weights and compact DNN architectures during training.

Network Pruning
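
The grow-and-prune mechanism mentioned in the NeST entry above can be sketched on a single weight matrix: grow connections where the gradient magnitude is largest, prune the active weights with the smallest magnitude. The thresholds, counts, and random data below are illustrative assumptions; the actual NeST policy and schedule are considerably more involved.

```python
import numpy as np

# Minimal grow-and-prune sketch on one weight matrix with a binary mask.

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 8))
mask = rng.random((8, 8)) < 0.2           # start from a sparse seed network

def grow(W, mask, grad, k):
    """Activate the k inactive connections with the largest |gradient|."""
    scores = np.where(mask, -np.inf, np.abs(grad))
    idx = np.unravel_index(np.argsort(scores, axis=None)[-k:], W.shape)
    mask[idx] = True
    W[idx] = 0.0                          # new connections start at zero
    return W, mask

def prune(W, mask, k):
    """Remove the k active connections with the smallest |weight|."""
    scores = np.where(mask, np.abs(W), np.inf)
    idx = np.unravel_index(np.argsort(scores, axis=None)[:k], W.shape)
    mask[idx] = False
    return W, mask

fake_grad = rng.normal(size=(8, 8))       # stand-in for a real gradient
W, mask = grow(W, mask, fake_grad, k=5)
W, mask = prune(W * mask, mask, k=3)
print(int(mask.sum()), "active connections")
```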
