Search Results for author: Souvik Kundu

Found 51 papers, 17 papers with code

AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models

no code implementations20 Mar 2024 Zeyu Liu, Souvik Kundu, Anni Li, Junrui Wan, Lianghao Jiang, Peter Anthony Beerel

When compared in terms of runtime, AFLoRA can yield up to a $1.86\times$ improvement over similar PEFT alternatives.
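
To make the headline idea concrete, here is a minimal sketch of a LoRA-style adapter whose low-rank factors can be adaptively frozen during fine-tuning; the freezing criterion, threshold, and class names are illustrative assumptions, not the AFLoRA algorithm itself.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Linear layer with a frozen pre-trained weight and a trainable low-rank adapter."""
    def __init__(self, in_features, out_features, rank=8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weight stays frozen
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.lora_A.T @ self.lora_B.T

    def maybe_freeze(self, threshold=1e-4):
        # Hypothetical "adaptive freezing": stop updating a factor whose recent
        # gradients have become negligibly small.
        for factor in (self.lora_A, self.lora_B):
            if factor.requires_grad and factor.grad is not None \
                    and factor.grad.abs().mean() < threshold:
                factor.requires_grad_(False)
```

Calling maybe_freeze() after each optimizer step would progressively remove adapter factors from the trainable set, which is the kind of mechanism the runtime savings above allude to.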

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

1 code implementation8 Mar 2024 Hao Kang, Qingru Zhang, Souvik Kundu, Geonhwa Jeong, Zaoxing Liu, Tushar Krishna, Tuo Zhao

Key-value (KV) caching has become the de-facto technique for accelerating generation speed in large language model (LLM) inference.

Quantization
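
As a rough illustration of one ingredient of such a compression recipe (not the GEAR method itself), the snippet below uniformly quantizes cached key/value tensors to a low bit-width and dequantizes them on read; grouping, outlier handling, and the error-compensation terms of the actual paper are omitted, and all names are assumptions.

```python
import torch

def quantize_kv(kv: torch.Tensor, n_bits: int = 4):
    """Uniformly quantize a cached key/value tensor along its last dimension (simplified sketch)."""
    qmax = 2 ** n_bits - 1
    mn = kv.amin(dim=-1, keepdim=True)
    mx = kv.amax(dim=-1, keepdim=True)
    scale = (mx - mn).clamp(min=1e-8) / qmax
    q = ((kv - mn) / scale).round().clamp(0, qmax).to(torch.uint8)
    return q, scale, mn  # store these instead of the full-precision cache

def dequantize_kv(q: torch.Tensor, scale: torch.Tensor, mn: torch.Tensor) -> torch.Tensor:
    """Reconstruct an approximate full-precision tensor before attention is computed."""
    return q.float() * scale + mn
```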

CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware

no code implementations19 Feb 2024 Souvik Kundu, Anthony Sarah, Vinay Joshi, Om J Omer, Sreenivas Subramoney

With the recent growth in demand for large-scale deep neural networks, compute-in-memory (CiM) has emerged as a prominent solution to alleviate the bandwidth and on-chip interconnect bottlenecks that constrain von Neumann architectures.

Neural Architecture Search

Linearizing Models for Efficient yet Robust Private Inference

no code implementations8 Feb 2024 Sreetama Sarkar, Souvik Kundu, Peter A. Beerel

Our experimental evaluations show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images.

Sparse but Strong: Crafting Adversarially Robust Graph Lottery Tickets

no code implementations11 Dec 2023 Subhajit Dutta Chowdhury, Zhiyu Ni, Qingyuan Peng, Souvik Kundu, Pierluigi Nuzzo

By iteratively applying ARGS to prune both the perturbed graph adjacency matrix and the GNN model weights, we can find adversarially robust graph lottery tickets that are highly sparse yet achieve competitive performance under different untargeted training-time structure attacks.

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data

no code implementations7 Dec 2023 Yuhang Li, Youngeun Kim, DongHyun Lee, Souvik Kundu, Priyadarshini Panda

In the realm of deep neural network deployment, low-bit quantization presents a promising avenue for enhancing computational efficiency.

Computational Efficiency Quantization +1

Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to Technology

no code implementations2 Dec 2023 Souvik Kundu, Rui-Jie Zhu, Akhilesh Jaiswal, Peter A. Beerel

Neuromorphic computing and, in particular, spiking neural networks (SNNs) have become an attractive alternative to deep neural networks for a broad range of signal processing applications, processing static and/or temporal inputs from different sensory modalities, including audio and vision sensors.

Fusing Models with Complementary Expertise

no code implementations2 Oct 2023 Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research.

Multiple-choice text-classification +2

Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

1 code implementation29 Sep 2023 Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang

Contrary to this belief, this paper presents a counter-argument: small-magnitude weights of pre-trained models encode vital knowledge essential for tackling difficult downstream tasks, manifested as a monotonic relationship between the performance drop on downstream tasks across the difficulty spectrum and the fraction of pre-trained weights pruned by magnitude.

Quantization
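
For context on what pruning by magnitude means in the excerpt above, the sketch below zeroes out the globally smallest-magnitude weights of a model's linear layers at a target sparsity; it is a generic illustration with assumed names, not the paper's exact protocol.

```python
import torch
import torch.nn as nn

def magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the globally smallest-magnitude weights of all linear layers in place."""
    weights = [m.weight for m in model.modules() if isinstance(m, nn.Linear)]
    all_vals = torch.cat([w.detach().abs().flatten() for w in weights])
    k = max(1, int(sparsity * all_vals.numel()))
    threshold = all_vals.kthvalue(k).values  # k-th smallest magnitude overall
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).float())
```

Sweeping the sparsity level and measuring the resulting drop on easy versus difficult downstream tasks is the kind of experiment the monotonic relationship above refers to.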

InstaTune: Instantaneous Neural Architecture Search During Fine-Tuning

no code implementations29 Aug 2023 Sharath Nittur Sridhar, Souvik Kundu, Sairam Sundaresan, Maciej Szankin, Anthony Sarah

However, training super-networks from scratch can be extremely time-consuming and compute-intensive, especially for large models that rely on a two-stage training process of pre-training and fine-tuning.

Neural Architecture Search

FireFly: A Synthetic Dataset for Ember Detection in Wildfire

1 code implementation6 Aug 2023 Yue Hu, Xinan Ye, Yifei Liu, Souvik Kundu, Gourav Datta, Srikar Mutnuri, Namo Asavisanu, Nora Ayanian, Konstantinos Psounis, Peter Beerel

This paper presents "FireFly", a synthetic dataset for ember detection created using Unreal Engine 4 (UE4), designed to overcome the current lack of ember-specific training resources.

Object Detection

Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT

no code implementations14 Jul 2023 Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan

In this paper, we present Sensi-BERT, a sensitivity-driven efficient fine-tuning approach for BERT models that can take an off-the-shelf pre-trained BERT model and yield highly parameter-efficient models for downstream tasks.

QNLI QQP +4

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations

1 code implementation10 Jun 2023 Yonggan Fu, Ye Yuan, Souvik Kundu, Shang Wu, Shunyao Zhang, Yingyan Lin

Generalizable Neural Radiance Fields (GNeRF) are one of the most promising real-world solutions for novel view synthesis, thanks to their cross-scene generalization capability and thus the possibility of instant rendering on new scenes.

Adversarial Robustness Novel View Synthesis

Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

no code implementations26 Apr 2023 Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

The large number of ReLU and MAC operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference.

Model Optimization

Technology-Circuit-Algorithm Tri-Design for Processing-in-Pixel-in-Memory (P2M)

no code implementations6 Apr 2023 Md Abdullah-Al Kaiser, Gourav Datta, Sreetama Sarkar, Souvik Kundu, Zihan Yin, Manas Garg, Ajey P. Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme edge.

ViTA: A Vision Transformer Inference Accelerator for Edge Applications

no code implementations17 Feb 2023 Shashank Nag, Gourav Datta, Souvik Kundu, Nitin Chandrachoodan, Peter A. Beerel

Vision Transformer models, such as ViT, Swin Transformer, and Transformer-in-Transformer, have recently gained significant traction in computer vision tasks due to their ability to capture global relations between features, which leads to superior performance.

Edge-computing

Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference

no code implementations23 Jan 2023 Souvik Kundu, Shunlin Lu, Yuke Zhang, Jacqueline Liu, Peter A. Beerel

For a similar ReLU budget, SENet can yield models with ~2.32% improved classification accuracy, evaluated on CIFAR-100.

Vision HGNN: An Image is More than a Graph of Nodes

1 code implementation ICCV 2023 Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, Zhangyang Wang

In this paper, we enhance ViG by transcending conventional "pairwise" linkages and harnessing the power of the hypergraph to encapsulate image information.

graph construction Image Classification +2

SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation

no code implementations ICCV 2023 Yuke Zhang, Dake Chen, Souvik Kundu, Chenghao Li, Peter A. Beerel

Then, given our observation that external attention (EA) presents lower PI latency than widely-adopted self-attention (SA) at the cost of accuracy, we present a selective attention search (SAS) method to integrate the strength of EA and SA.

Sparse Mixture Once-for-all Adversarial Training for Efficient In-Situ Trade-Off Between Accuracy and Robustness of DNNs

no code implementations27 Dec 2022 Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel

Existing deep neural networks (DNNs) that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on either activation or weight conditioned convolution operations.

Image Classification

In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

no code implementations21 Dec 2022 Gourav Datta, Zeyu Liu, Md Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy.

Total Energy

Self-Attentive Pooling for Efficient Deep Learning

no code implementations16 Sep 2022 Fang Chen, Gourav Datta, Souvik Kundu, Peter Beerel

With the aggressive down-sampling of the activation maps in the initial layers (providing up to 22x reduction in memory consumption), our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.

Dynamic Calibration of Nonlinear Sensors with Time-Drifts and Delays by Bayesian Inference

no code implementations29 Aug 2022 Soumyabrata Talukder, Souvik Kundu, Ratnesh Kumar

Most sensor calibrations rely on the linearity and steadiness of their response characteristics, but practical sensors are nonlinear, and their response drifts with time, restricting their choices for adoption.

Bayesian Inference

Federated Learning of Large Models at the Edge via Principal Sub-Model Training

1 code implementation28 Aug 2022 Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr

However, the heterogeneous-client setting requires some clients to train the full model, which is not aligned with the resource-constrained setting, while the latter breaks privacy promises in FL when sharing intermediate representations or labels with the server.

Federated Learning

Lottery Aware Sparsity Hunting: Enabling Federated Learning on Resource-Limited Edge

1 code implementation27 Aug 2022 Sara Babakniya, Souvik Kundu, Saurav Prakash, Yue Niu, Salman Avestimehr

A possible solution to this problem is to utilize off-the-shelf sparse learning algorithms at the clients to meet their resource budget.

Federated Learning Model Compression +1

Implementation of fast ICA using memristor crossbar arrays for blind image source separations

no code implementations7 Aug 2022 Pavan Kumar Reddy Boppidi, Victor Jeffry Louis, Arvind Subramaniam, Rajesh K. Tripathy, Souri Banerjee, Souvik Kundu

The experimental results demonstrate that the proposed approach is very effective at separating image sources; the contrast of the images is also improved, with an improvement factor of 67.27% in terms of structural similarity when compared with software-based implementations of the conventional ACY ICA and Fast ICA algorithms.

blind source separation

A Fast and Efficient Conditional Learning for Tunable Trade-Off between Accuracy and Robustness

no code implementations28 Mar 2022 Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel

In this paper, we present a fast learnable once-for-all adversarial training (FLOAT) algorithm, which, instead of the existing FiLM-based conditioning, presents a unique weight-conditioned learning that requires no additional layer, thereby incurring no significant increase in parameter count, training time, or network latency compared to standard adversarial training.

Image Classification
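
Purely as an illustrative guess at what weight-conditioned learning without extra layers could look like (the excerpt does not spell out the FLOAT formulation), the sketch below toggles a learnable per-channel transform of a convolution's weight tensor between a clean and a robust mode; the transform and the mode switch are assumptions.

```python
import torch
import torch.nn as nn

class ConditionedConv2d(nn.Conv2d):
    """Conv2d whose weights are reparameterized by a learnable per-channel scale that
    can be switched on for robust inference (illustrative, not the published FLOAT)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.robust_scale = nn.Parameter(torch.ones(self.out_channels, 1, 1, 1))
        self.robust_mode = False  # flip at run time to trade clean accuracy for robustness

    def forward(self, x):
        w = self.weight * self.robust_scale if self.robust_mode else self.weight
        return self._conv_forward(x, w, self.bias)
```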

P2M: A Processing-in-Pixel-in-Memory Paradigm for Resource-Constrained TinyML Applications

no code implementations7 Mar 2022 Gourav Datta, Souvik Kundu, Zihan Yin, Ravi Teja Lakkireddy, Joe Mathai, Ajey Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

Visual data in such cameras are usually captured in the form of analog voltages by a sensor pixel array, and then converted to the digital domain for subsequent AI processing using analog-to-digital converters (ADC).

BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch

no code implementations24 Dec 2021 Souvik Kundu, Shikai Wang, Qirui Sun, Peter A. Beerel, Massoud Pedram

Compared to the baseline FP-32 models, BMPQ can yield models that have 15.4x fewer parameter bits with a negligible drop in accuracy.

Quantization

Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation

no code implementations NeurIPS 2021 Souvik Kundu, Qirui Sun, Yao Fu, Massoud Pedram, Peter Beerel

Knowledge distillation (KD) has recently been identified as a method that can unintentionally leak private information regarding the details of a teacher model to an unauthorized student.

Knowledge Distillation

Pipeline Parallelism for Inference on Heterogeneous Edge Computing

no code implementations28 Oct 2021 Yang Hu, Connor Imes, Xuanang Zhao, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul N. Walters

We propose EdgePipe, a distributed framework for edge systems that uses pipeline parallelism to both speed up inference and enable running larger (and more accurate) models that otherwise cannot fit on single edge devices.

Edge-computing

Understanding of Emotion Perception from Art

no code implementations13 Oct 2021 Digbalay Bose, Krishna Somandepalli, Souvik Kundu, Rimita Lahiri, Jonathan Gratch, Shrikanth Narayanan

Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals.

HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise

1 code implementation ICCV 2021 Souvik Kundu, Massoud Pedram, Peter A. Beerel

Low-latency deep spiking neural networks (SNNs) have become a promising alternative to conventional artificial neural networks (ANNs) because of their potential for increased energy efficiency on event-driven neuromorphic hardware.

FLOAT: Fast Learnable Once-for-All Adversarial Training for Tunable Trade-Off Between Accuracy and Robustness

no code implementations29 Sep 2021 Souvik Kundu, Peter Anthony Beerel, Sairam Sundaresan

In this paper, we present Fast Learnable Once-for-all Adversarial Training (FLOAT) which transforms the weight tensors without using extra layers, thereby incurring no significant increase in parameter count, training time, or network latency compared to a standard adversarial training.

Image Classification

Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike Hybrid Input Encoding

no code implementations26 Jul 2021 Gourav Datta, Souvik Kundu, Peter A. Beerel

This paper presents a training framework for low-latency energy-efficient SNNs that uses a hybrid encoding scheme at the input layer in which the analog pixel values of an image are directly applied during the first timestep and a novel variant of spike temporal coding is used during subsequent timesteps.

Computational Efficiency Image Classification
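
A minimal sketch of the hybrid input encoding described above, with an assumed Bernoulli rate code standing in for the paper's variant of spike temporal coding at the later timesteps:

```python
import torch

def hybrid_encode(images: torch.Tensor, timesteps: int) -> torch.Tensor:
    """Feed analog pixel intensities (normalized to [0, 1]) directly at the first
    timestep, and simple stochastic spikes afterwards (a stand-in for the paper's
    temporal coding, not the exact scheme)."""
    frames = [images]  # t = 0: direct analog input, as in the abstract
    for _ in range(timesteps - 1):  # t > 0: spike with probability = pixel intensity
        frames.append((torch.rand_like(images) < images).float())
    return torch.stack(frames)  # shape: (T, batch, C, H, W)
```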

HYPER-SNN: Towards Energy-efficient Quantized Deep Spiking Neural Networks for Hyperspectral Image Classification

no code implementations26 Jul 2021 Gourav Datta, Souvik Kundu, Akhilesh R. Jaiswal, Peter A. Beerel

However, the accurate processing of the spectral and spatial correlation between the bands requires the use of energy-expensive 3-D Convolutional Neural Networks (CNNs).

Computational Efficiency Hyperspectral Image Classification +1

Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression

no code implementations16 Jul 2021 Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet, on both CIFAR-10 and CIFAR-100, and VGG16 on Tiny-ImageNet. The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drops in accuracy compared to baseline unpruned counterparts.

Sparse Learning

AttentionLite: Towards Efficient Self-Attention Models for Vision

no code implementations21 Dec 2020 Souvik Kundu, Sairam Sundaresan

We propose a novel framework for producing a class of parameter- and compute-efficient models, called AttentionLite, suitable for resource-constrained applications.

Knowledge Distillation

Attention-based Image Upsampling

no code implementations17 Dec 2020 Souvik Kundu, Hesham Mostafa, Sharath Nittur Sridhar, Sairam Sundaresan

Convolutional layers are an integral part of many deep neural network solutions in computer vision.

Image Classification Image Super-Resolution +2

A Co-Attentive Cross-Lingual Neural Model for Dialogue Breakdown Detection

1 code implementation COLING 2020 Qian Lin, Souvik Kundu, Hwee Tou Ng

One of the major challenges is that a dialogue system may generate an undesired utterance leading to a dialogue breakdown, which degrades the overall interaction quality.

Language Modelling Word Embeddings

A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

1 code implementation3 Nov 2020 Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram

This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images.

Image Classification Model Compression

Learning to Identify Follow-Up Questions in Conversational Question Answering

no code implementations ACL 2020 Souvik Kundu, Qian Lin, Hwee Tou Ng

Despite recent progress in conversational question answering, most prior work does not focus on follow-up questions.

Conversational Question Answering

Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks

1 code implementation29 Jan 2020 Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2.

A Pre-defined Sparse Kernel Based Convolution for Deep CNNs

no code implementations2 Oct 2019 Souvik Kundu, Saurav Prakash, Haleh Akrami, Peter A. Beerel, Keith M. Chugg

To explore the potential of this approach, we have experimented with two widely accepted datasets, CIFAR-10 and Tiny ImageNet, in sparse variants of both the ResNet18 and VGG16 architectures.

Exploiting Explicit Paths for Multi-hop Reading Comprehension

1 code implementation ACL 2019 Souvik Kundu, Tushar Khot, Ashish Sabharwal, Peter Clark

To capture additional context, PathNet also composes the passage representations along each path to compute a passage-based representation.

Implicit Relations Knowledge Graphs +1

A Nil-Aware Answer Extraction Framework for Question Answering

1 code implementation EMNLP 2018 Souvik Kundu, Hwee Tou Ng

However, current approaches suffer from an impractical assumption that every question has a valid answer in the associated passage.

Question Answering Reading Comprehension +1

A Highly Parallel FPGA Implementation of Sparse Neural Network Training

1 code implementation31 May 2018 Sourya Dey, Diandian Chen, Zongyang Li, Souvik Kundu, Kuan-Wen Huang, Keith M. Chugg, Peter A. Beerel

We demonstrate an FPGA implementation of a parallel and reconfigurable architecture for sparse neural networks, capable of on-chip training and inference.

A Question-Focused Multi-Factor Attention Network for Question Answering

1 code implementation25 Jan 2018 Souvik Kundu, Hwee Tou Ng

Neural network models recently proposed for question answering (QA) primarily focus on capturing the passage-question relation.

Open-Domain Question Answering Reading Comprehension +2
