Search Results for author: Alind Khare

Found 10 papers, 3 papers with code

SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads

no code implementations27 Dec 2023 Alind Khare, Dhruv Garg, Sukrit Kalra, Snigdha Grandhi, Ion Stoica, Alexey Tumanov

Serving models under such conditions requires these systems to strike a careful balance between the latency and accuracy requirements of the application and the overall efficiency of utilization of scarce resources.
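The latency/accuracy balance described above can be sketched as a toy selection rule. This is an illustrative assumption, not SuperServe's actual scheduler: `pick_model`, the model dictionaries, and the greedy policy are all hypothetical.

```python
def pick_model(models, latency_budget_ms):
    """Toy policy for the latency/accuracy trade-off (hypothetical, not
    SuperServe's algorithm): serve the most accurate model variant whose
    estimated latency fits the request's budget; if none fits, fall back
    to the fastest variant."""
    feasible = [m for m in models if m["latency_ms"] <= latency_budget_ms]
    if feasible:
        return max(feasible, key=lambda m: m["accuracy"])
    return min(models, key=lambda m: m["latency_ms"])
```

A real serving system must also account for queueing and shared-resource contention, which is where the paper's fine-grained scheduling comes in.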

Scheduling

ABKD: Graph Neural Network Compression with Attention-Based Knowledge Distillation

no code implementations24 Oct 2023 Anshul Ahluwalia, Rohit Das, Payman Behnam, Alind Khare, Pan Li, Alexey Tumanov

To address this shortcoming, we propose a novel KD approach to GNN compression that we call Attention-Based Knowledge Distillation (ABKD).
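The idea of attention-weighted distillation can be sketched roughly as follows. Everything here is an assumption made for illustration: the function name, the pairwise MSE objective, and the softmax weighting are hypothetical stand-ins, not ABKD's exact formulation.

```python
import numpy as np

def attention_weighted_kd_loss(student_feats, teacher_feats):
    """Hypothetical sketch of attention-based feature distillation:
    every (teacher layer, student layer) pair contributes a distillation
    loss, and a softmax attention over pair alignments decides how much
    each pair matters. Details are illustrative, not ABKD's."""
    pair_losses = []
    for t in teacher_feats:
        for s in student_feats:
            # interpolate student features to the teacher's width so
            # layers of different sizes can be compared
            s_resized = np.interp(np.linspace(0, 1, t.size),
                                  np.linspace(0, 1, s.size), s)
            pair_losses.append(np.mean((s_resized - t) ** 2))
    losses = np.array(pair_losses)
    # better-aligned pairs (smaller loss) receive more attention weight
    attn = np.exp(-losses) / np.exp(-losses).sum()
    return float((attn * losses).sum())
```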

Drug Discovery · Fake News Detection +3

SuperFed: Weight Shared Federated Learning

no code implementations26 Jan 2023 Alind Khare, Animesh Agrawal, Myungjin Lee, Alexey Tumanov

We propose SuperFed - an architectural framework that incurs $O(1)$ cost to co-train a large family of models in a federated fashion by leveraging weight-shared learning.
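The $O(1)$ co-training claim rests on weight sharing: one set of supernet weights is reused by every sub-model. A minimal sketch of one federated round under that idea, with all names and the prefix-slice sharing scheme assumed for illustration (this is not SuperFed's actual algorithm):

```python
import numpy as np

def weight_shared_fed_round(supernet_w, client_update_fn, widths, rng, lr=0.1):
    """Hypothetical weight-shared federated round: each client trains a
    sub-model that is a prefix slice of the shared supernet weights, so
    one round of training updates a whole family of model sizes at once
    instead of training each size separately."""
    updates = np.zeros_like(supernet_w)
    counts = np.zeros_like(supernet_w)
    for _ in range(len(widths)):                # one sampled sub-model per client
        w = rng.choice(widths)                  # sample a sub-model width
        sub = supernet_w[:w]                    # weight sharing: sub-model is a slice
        updates[:w] += client_update_fn(sub)    # client's local gradient-like update
        counts[:w] += 1
    # average updates per weight; untouched weights stay unchanged
    return supernet_w - lr * np.divide(updates, np.maximum(counts, 1))
```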

Federated Learning · Privacy Preserving

UnfoldML: Cost-Aware and Uncertainty-Based Dynamic 2D Prediction for Multi-Stage Classification

no code implementations26 Oct 2022 Yanbo Xu, Alind Khare, Glenn Matlin, Monish Ramadoss, Rishikesan Kamaleswaran, Chao Zhang, Alexey Tumanov

It achieves accuracy within 0.1% of the highest-performing multi-class baseline, while saving close to 20X on the spatio-temporal cost of inference and predicting disease onset earlier (by 3.5 hrs).
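The cost savings come from not running expensive stages on easy inputs. A minimal sketch of cost-aware staged inference, where the function name, the confidence threshold, and the escalation rule are illustrative assumptions rather than UnfoldML's exact 2D prediction scheme:

```python
import numpy as np

def staged_predict(x, stages, conf_threshold=0.9):
    """Hypothetical cost-aware multi-stage inference: cheap models run
    first, and the input escalates to the next, costlier stage only
    while the current prediction remains uncertain, saving inference
    cost on easy inputs."""
    for model in stages:
        probs = model(x)
        if probs.max() >= conf_threshold:       # confident: stop early
            return int(probs.argmax()), model
    return int(probs.argmax()), model           # fall back to the last stage
```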

Image Classification

CompOFA: Compound Once-For-All Networks for Faster Multi-Platform Deployment

1 code implementation26 Apr 2021 Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize the accuracy under diverse hardware & latency constraints.
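One way such a family of architectures can be made tractable is to couple the design dimensions instead of searching them independently; the sketch below assumes (from the "Compound" in the name, not from the snippet above) that CompOFA does something of this flavor, and all identifiers are hypothetical:

```python
from itertools import product

def full_space(depths, widths):
    # independent search: every (depth, width) combination is a candidate
    return list(product(depths, widths))

def compound_space(depths, widths):
    # compound heuristic (assumed for illustration): couple the dimensions
    # so deeper sub-networks are also wider, shrinking the family of
    # models that must be trained and searched
    return list(zip(depths, widths))
```

The coupled space grows linearly rather than multiplicatively in the number of per-dimension options.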

CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment

2 code implementations ICLR 2021 Manas Sahni, Shreya Varshini, Alind Khare, Alexey Tumanov

The emergence of CNNs in mainstream deployment has necessitated methods to design and train efficient architectures tailored to maximize the accuracy under diverse hardware & latency constraints.

HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units

3 code implementations10 Aug 2020 Shenda Hong, Yanbo Xu, Alind Khare, Satria Priambada, Kevin Maher, Alaa Aljiffry, Jimeng Sun, Alexey Tumanov

HOLMES is tested on a risk prediction task on pediatric cardiac ICU data, achieving above 95% prediction accuracy and sub-second latency in a 64-bed simulation.

Navigate

A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs

no code implementations25 Oct 2019 Koyel Mukherjee, Alind Khare, Ashish Verma

Training neural networks on image datasets generally requires extensive experimentation to find the optimal learning rate regime.
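A dynamic alternative to that hand-tuning can be sketched as a simple feedback rule. This rule (raise the rate while the loss improves, cut it when it worsens) and its factors are assumptions for illustration, not the paper's actual algorithm:

```python
def dynamic_lr(prev_loss, curr_loss, lr, up=1.05, down=0.7):
    """Hypothetical rule-of-thumb learning-rate controller: gently raise
    the learning rate while the training loss keeps improving, and cut
    it sharply when the loss worsens, removing the need to hand-tune a
    fixed schedule per dataset."""
    return lr * up if curr_loss < prev_loss else lr * down
```

In practice such rules are usually applied on a smoothed loss to avoid reacting to minibatch noise.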
