Search Results for author: Raghu Ganti

Found 5 papers, 1 paper with code

SudokuSens: Enhancing Deep Learning Robustness for IoT Sensing Applications using a Generative Approach

no code implementations • 3 Feb 2024 • Tianshi Wang, Jinyang Li, Ruijie Wang, Denizhan Kara, Shengzhong Liu, Davis Wertheimer, Antoni Viros-i-Martin, Raghu Ganti, Mudhakar Srivatsa, Tarek Abdelzaher

To incorporate sufficient diversity into the IoT training data, one therefore needs to consider a combinatorial explosion of training cases that are multiplicative in the number of objects considered and the possible environmental conditions in which such objects may be encountered.

Contrastive Learning

TP-Aware Dequantization

no code implementations • 15 Jan 2024 • Adnan Hoque, Mudhakar Srivatsa, Chih-Chieh Yang, Raghu Ganti

In this paper, we present a novel method that reduces model inference latency during distributed deployment of Large Language Models (LLMs).

Quantization

Accelerating a Triton Fused Kernel for W4A16 Quantized Inference with SplitK work decomposition

no code implementations • 5 Jan 2024 • Adnan Hoque, Less Wright, Chih-Chieh Yang, Mudhakar Srivatsa, Raghu Ganti

Our implementation shows improvement for the type of skinny matrix-matrix multiplications found in foundation model inference workloads.
