Search Results for author: Souvik Kundu

Found 74 papers, 30 papers with code

On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention

no code implementations11 Jun 2025 Yeonju Ro, Zhenyu Zhang, Souvik Kundu, Zhangyang Wang, Aditya Akella

Large language models (LLMs) excel at capturing global token dependencies via self-attention but face prohibitive compute and memory costs on lengthy inputs.

Text Summarization
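
The snippet above contrasts quadratic softmax self-attention with linear-complexity alternatives. The NumPy sketch below shows a generic kernelized linear-attention step (the feature map and shapes are illustrative assumptions); it is not the paper's dual-state linear attention or its on-the-fly distillation procedure.

```python
import numpy as np

def feature_map(x):
    # A simple positive feature map, elu(x) + 1; the choice of kernel is an assumption.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    """O(n*d*d_v) attention: phi(Q) (phi(K)^T V) instead of the O(n^2) softmax(Q K^T) V."""
    phi_q, phi_k = feature_map(q), feature_map(k)             # (n, d)
    kv = phi_k.T @ v                                          # (d, d_v), built once
    normalizer = phi_q @ phi_k.sum(axis=0, keepdims=True).T   # (n, 1)
    return (phi_q @ kv) / (normalizer + 1e-6)                 # (n, d_v)

rng = np.random.default_rng(0)
n, d = 8, 4
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
print(linear_attention(q, k, v).shape)  # (8, 4)
```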

Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits

no code implementations27 May 2025 Yeshwanth Venkatesha, Souvik Kundu, Priyadarshini Panda

To demonstrate real-world applicability, we deploy our method on the Unitree Go2 quadruped robot using Vision-Language Model (VLM) based control, achieving a 21% speedup over traditional cloud-based autoregressive decoding.

Mitigating Hallucinations in Vision-Language Models through Image-Guided Head Suppression

1 code implementation22 May 2025 Sreetama Sarkar, Yue Che, Alex Gavin, Peter A. Beerel, Souvik Kundu

Despite their remarkable progress in multimodal understanding tasks, large vision language models (LVLMs) often suffer from "hallucinations", generating texts misaligned with the visual context.

Hallucination, Question Answering +1


Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator

no code implementations19 Apr 2025 Akshat Ramachandran, Souvik Kundu, Arnab Raha, Shamik Kundu, Deepak K. Mathaikutty, Tushar Krishna

FLOW enables the identification of optimal layer-wise N and M values (from a given range) by simultaneously accounting for the presence and distribution of outliers, allowing a higher degree of representational freedom.

Large Language Model
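
For reference, N:M sparsity keeps the N largest-magnitude weights in every contiguous group of M weights. The NumPy sketch below shows only that masking step; FLOW's layer-wise selection of N and M and its outlier-aware treatment are not reproduced here.

```python
import numpy as np

def nm_prune(weight, n=2, m=4):
    """Zero all but the n largest-magnitude entries in each group of m along the last dim."""
    rows, cols = weight.shape
    assert cols % m == 0, "last dim must be divisible by m"
    groups = weight.reshape(rows, cols // m, m)
    # Indices of the (m - n) smallest-magnitude entries in every group.
    drop = np.argsort(np.abs(groups), axis=-1)[..., : m - n]
    mask = np.ones_like(groups)
    np.put_along_axis(mask, drop, 0.0, axis=-1)
    return (groups * mask).reshape(rows, cols)

w = np.random.randn(4, 8)
print(nm_prune(w, n=2, m=4))  # every group of 4 now has exactly 2 nonzeros
```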

Understanding and Optimizing Multi-Stage AI Inference Pipelines

no code implementations14 Apr 2025 Abhimanyu Rajeshkumar Bambhaniya, Hanjiang Wu, Suvinay Subramanian, Sudarshan Srinivasan, Souvik Kundu, Amir Yazdanbakhsh, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna

Through case studies, we explore the impact of reasoning stages on end-to-end latency, optimal batching strategies for hybrid pipelines, and the architectural implications of remote KV cache retrieval.

Navigate, RAG +2

SEAL: Steerable Reasoning Calibration of Large Language Models for Free

1 code implementation7 Apr 2025 Runjin Chen, Zhenyu Zhang, Junyuan Hong, Souvik Kundu, Zhangyang Wang

To address this issue, we investigate the internal reasoning structures of LLMs and categorize them into three primary thought types: execution, reflection, and transition thoughts.

GSM8K

OuroMamba: A Data-Free Quantization Framework for Vision Mamba Models

no code implementations13 Mar 2025 Akshat Ramachandran, Mingyu Lee, Huan Xu, Souvik Kundu, Tushar Krishna

We identify two key challenges in enabling DFQ for VMMs: (1) VMM's recurrent state transitions restrict the capture of long-range interactions and lead to semantically weak synthetic data, and (2) VMM activations exhibit dynamic outlier variations across time-steps, rendering existing static PTQ techniques ineffective.

channel selection, Contrastive Learning +3

Enhancing Large Language Models for Hardware Verification: A Novel SystemVerilog Assertion Dataset

1 code implementation11 Mar 2025 Anand Menon, Samit S Miftah, Shamik Kundu, Souvik Kundu, Amisha Srivastava, Arnab Raha, Gabriel Theodor Sonnenschein, Suvadeep Banerjee, Deepak Mathaikutty, Kanad Basu

However, proprietary SOTA models like GPT-4o often generate inaccurate assertions and require expensive licenses, while smaller open-source LLMs need fine-tuning to manage HDL code complexities.

LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression

no code implementations6 Mar 2025 Souvik Kundu, Anahita Bhiwandiwalla, Sungduk Yu, Phillip Howard, Tiep Le, Sharath Nittur Sridhar, David Cobbley, Hao Kang, Vasudev Lal

Despite recent efforts to understand the impact of compression on large language models (LLMs) in terms of their downstream task performance and trustworthiness on relatively simple uni-modal benchmarks (for example, question answering and common sense reasoning), a detailed study of its impact on multi-modal large vision-language models (LVLMs) is still missing.

Benchmarking, Common Sense Reasoning +8

LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models

1 code implementation10 Feb 2025 Sihwan Park, Doohyuk Jang, Sungyub Kim, Souvik Kundu, Eunho Yang

Recently, relaxed speculative decoding with dynamic tree drafting was proposed to mitigate this ambiguity, demonstrating promising results in accelerating visual AR models.

Text Generation

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing

no code implementations4 Feb 2025 Wenhao Zheng, Yixiao Chen, Weitong Zhang, Souvik Kundu, Yun Li, Zhengzhong Liu, Eric P. Xing, Hongyi Wang, Huaxiu Yao

This allows the router to learn to predict token-level routing scores and make routing decisions based on both the current token and the future impact of its decisions.

Collaborative Inference, Language Modeling +2

Unraveling Zeroth-Order Optimization through the Lens of Low-Dimensional Structured Perturbations

no code implementations31 Jan 2025 Sihwan Park, Jihun Yun, Sungyub Kim, Souvik Kundu, Eunho Yang

In this work, we develop a unified theoretical framework that analyzes both the convergence and generalization properties of ZO optimization under structured perturbations.

Large Language Model
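
As background, zeroth-order (ZO) methods estimate gradients from function values alone, typically with a two-point estimator along random directions. The sketch below uses plain Gaussian perturbations; the structured low-dimensional perturbations analyzed in the paper are intentionally omitted.

```python
import numpy as np

def zo_gradient(loss_fn, theta, eps=1e-3, num_samples=64, rng=None):
    """Two-point zeroth-order gradient estimate averaged over random Gaussian directions."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(num_samples):
        u = rng.standard_normal(theta.shape)
        delta = (loss_fn(theta + eps * u) - loss_fn(theta - eps * u)) / (2.0 * eps)
        grad += delta * u
    return grad / num_samples

# Toy quadratic loss: the true gradient is 2 * theta.
loss = lambda t: float(np.sum(t ** 2))
theta = np.array([1.0, -2.0, 0.5])
print(zo_gradient(loss, theta))  # noisy estimate of roughly [ 2., -4.,  1.]
```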

GenBFA: An Evolutionary Optimization Approach to Bit-Flip Attacks on LLMs

no code implementations21 Nov 2024 Sanjay Das, Swastik Bhattacharya, Souvik Kundu, Shamik Kundu, Anand Menon, Arnab Raha, Kanad Basu

Current BFA techniques are inadequate for exploiting this vulnerability due to the difficulty of efficiently identifying critical parameters within the immense parameter space.

MMLU, Text Generation

MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization

1 code implementation8 Nov 2024 Akshat Ramachandran, Souvik Kundu, Tushar Krishna

Quantization of foundational models (FMs) is significantly more challenging than traditional DNNs due to the emergence of large magnitude values called outliers.

Quantization

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding

1 code implementation4 Oct 2024 Doohyuk Jang, Sihwan Park, June Yong Yang, Yeonsung Jung, Jihun Yun, Souvik Kundu, Sung-Yub Kim, Eunho Yang

To overcome this challenge, we propose a relaxed acceptance condition referred to as LANTERN that leverages the interchangeability of tokens in latent space.

Image Generation
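
For context, vanilla speculative decoding accepts a drafted token with probability min(1, p_target/p_draft) and otherwise resamples from the normalized residual distribution. LANTERN relaxes this acceptance using token interchangeability in latent space, which is not reproduced in the minimal sketch below.

```python
import numpy as np

def accept_or_resample(p_target, p_draft, drafted_token, rng):
    """Standard speculative-sampling acceptance test for a single drafted token."""
    accept_prob = min(1.0, p_target[drafted_token] / p_draft[drafted_token])
    if rng.random() < accept_prob:
        return drafted_token, True
    # Rejected: resample from the normalized residual max(p_target - p_draft, 0).
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p_target), p=residual), False

rng = np.random.default_rng(0)
p_t = np.array([0.6, 0.3, 0.1])   # toy target-model distribution
p_d = np.array([0.2, 0.5, 0.3])   # toy draft-model distribution
print(accept_or_resample(p_t, p_d, drafted_token=1, rng=rng))
```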

Understanding the Performance and Estimating the Cost of LLM Fine-Tuning

1 code implementation8 Aug 2024 Yuchen Xia, Jiho Kim, Yuhan Chen, Haojie Ye, Souvik Kundu, Cong Hao, Nishil Talati

This model, based on parameters of the model and GPU architecture, estimates LLM throughput and the cost of training, aiding practitioners in industry and academia to budget the cost of fine-tuning a specific model.

Mixture-of-Experts
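
The paper proposes an analytical model for fine-tuning throughput and cost; the snippet below is not that model but a common back-of-the-envelope estimate (about 6 FLOPs per parameter per token for a forward+backward pass, scaled by an assumed hardware utilization and a placeholder GPU price).

```python
def estimate_finetune_cost(num_params, num_tokens, peak_flops, num_gpus,
                           utilization=0.4, dollars_per_gpu_hour=2.0):
    """Back-of-the-envelope fine-tuning time and cost (all inputs are assumptions)."""
    total_flops = 6.0 * num_params * num_tokens          # ~6 FLOPs per parameter per token
    seconds = total_flops / (peak_flops * utilization * num_gpus)
    hours = seconds / 3600.0
    cost = hours * num_gpus * dollars_per_gpu_hour
    return hours, cost

# Example: a 7B-parameter model, 1B training tokens, 8 GPUs at 312 TFLOP/s peak.
hours, dollars = estimate_finetune_cost(7e9, 1e9, 312e12, 8)
print(f"~{hours:.1f} h wall clock, ~${dollars:.0f}")
```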

MaskVD: Region Masking for Efficient Video Object Detection

no code implementations16 Jul 2024 Sreetama Sarkar, Gourav Datta, Souvik Kundu, Kai Zheng, Chirayata Bhattacharyya, Peter A. Beerel

Video tasks are compute-heavy and thus pose a challenge when deploying in real-time applications, particularly for tasks that require state-of-the-art Vision Transformers (ViTs).

Object, object-detection +1

Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems

2 code implementations9 Jul 2024 Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov

However, these metrics fail to fully capture the nuances of LLM inference, leading to an incomplete assessment of user-facing performance crucial for real-time applications such as chat and translation.

LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation

1 code implementation18 Jun 2024 Seyedarmin Azizi, Souvik Kundu, Massoud Pedram

LaMDA freezes a first projection matrix (PMA) in the adaptation path while introducing a low-dimensional trainable square matrix, resulting in substantial reductions in trainable parameters and peak GPU memory usage.

Natural Language Understanding, Text Generation +1
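
One way to read the snippet above is as a LoRA-style adapter in which the down-projection (PMA) is frozen and only a small r x r square matrix is trained. The PyTorch sketch below reflects that reading; which matrices are frozen, and when, is an assumption here rather than the authors' reference implementation (only the r x r matrix is trainable in this sketch).

```python
import torch
import torch.nn as nn

class LowDimAdapterLinear(nn.Module):
    """Frozen base weight, frozen projections A and B, and a trainable r x r matrix S."""
    def __init__(self, in_features, out_features, r=8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)                 # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(in_features, r) / r, requires_grad=False)
        self.S = nn.Parameter(torch.zeros(r, r))               # trainable low-dimensional square matrix
        self.B = nn.Parameter(torch.randn(r, out_features) / r, requires_grad=False)

    def forward(self, x):
        return self.base(x) + (x @ self.A) @ self.S @ self.B

layer = LowDimAdapterLinear(64, 64, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 64 trainable parameters (the 8x8 S matrix)
```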

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

1 code implementation10 Jun 2024 Haoran You, Yipin Guo, Yichao Fu, Wei Zhou, Huihong Shi, Xiaofan Zhang, Souvik Kundu, Amir Yazdanbakhsh, Yingyan Celine Lin

Experiments on five LLM families and eight tasks consistently validate the effectiveness of ShiftAddLLM, achieving average perplexity improvements of 5.6 and 22.7 points at comparable or lower latency compared to the most competitive quantized LLMs at 3 and 2 bits, respectively, and more than 80% memory and energy reductions over the original LLMs.

Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models

1 code implementation3 Jun 2024 Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Suvinay Subramanian, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna

We have validated against real hardware platforms running various LLM models, achieving a max geomean error of 5.82. We use GenZ to identify compute, memory capacity, memory bandwidth, network latency, and network bandwidth requirements across diverse LLM inference use cases.

Chunking, Mamba +1

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

2 code implementations8 Mar 2024 Hao Kang, Qingru Zhang, Souvik Kundu, Geonhwa Jeong, Zaoxing Liu, Tushar Krishna, Tuo Zhao

Key-value (KV) caching has become the de-facto approach to accelerate generation speed for large language model (LLM) inference.

Quantization
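
For orientation, the simplest form of KV-cache compression is uniform low-bit quantization of the cached key/value tensors, with one scale and zero-point per token. GEAR additionally approximates the residual quantization error (e.g., with a low-rank term), which is omitted from this minimal PyTorch sketch.

```python
import torch

def quantize_per_token(x, num_bits=4):
    """Uniform asymmetric quantization of a (tokens, dim) tensor, one scale per token."""
    qmax = 2 ** num_bits - 1
    x_min = x.min(dim=-1, keepdim=True).values
    x_max = x.max(dim=-1, keepdim=True).values
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round((x - x_min) / scale), 0, qmax).to(torch.uint8)
    return q, scale, x_min

def dequantize(q, scale, x_min):
    return q.to(torch.float32) * scale + x_min

kv = torch.randn(16, 128)                  # toy cached keys (tokens x head_dim)
q, scale, zero = quantize_per_token(kv)
error = (dequantize(q, scale, zero) - kv).abs().mean()
print(f"mean abs reconstruction error: {error:.4f}")
```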

CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware

no code implementations19 Feb 2024 Souvik Kundu, Anthony Sarah, Vinay Joshi, Om J Omer, Sreenivas Subramoney

With the recent growth in demand for large-scale deep neural networks, compute-in-memory (CiM) has emerged as a prominent solution to alleviate the bandwidth and on-chip interconnect bottlenecks that constrain von Neumann architectures.

Neural Architecture Search

Linearizing Models for Efficient yet Robust Private Inference

no code implementations8 Feb 2024 Sreetama Sarkar, Souvik Kundu, Peter A. Beerel

Our experimental evaluations show that RLNet can yield models with up to 11.14x fewer ReLUs, with accuracy close to the all-ReLU models, on clean, naturally perturbed, and gradient-based perturbed images.

Sparse but Strong: Crafting Adversarially Robust Graph Lottery Tickets

no code implementations11 Dec 2023 Subhajit Dutta Chowdhury, Zhiyu Ni, Qingyuan Peng, Souvik Kundu, Pierluigi Nuzzo

By iteratively applying ARGS to prune both the perturbed graph adjacency matrix and the GNN model weights, we can find adversarially robust graph lottery tickets that are highly sparse yet achieve competitive performance under different untargeted training-time structure attacks.

Graph Neural Network

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data

no code implementations7 Dec 2023 Yuhang Li, Youngeun Kim, DongHyun Lee, Souvik Kundu, Priyadarshini Panda

In the realm of deep neural network deployment, low-bit quantization presents a promising avenue for enhancing computational efficiency.

Computational Efficiency, Quantization +1

Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to Technology

no code implementations2 Dec 2023 Souvik Kundu, Rui-Jie Zhu, Akhilesh Jaiswal, Peter A. Beerel

Neuromorphic computing and, in particular, spiking neural networks (SNNs) have become an attractive alternative to deep neural networks for a broad range of signal processing applications, processing static and/or temporal inputs from different sensory modalities, including audio and vision sensors.

Fusing Models with Complementary Expertise

1 code implementation2 Oct 2023 Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research.

Multiple-choice, text-classification +2

Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs

1 code implementation29 Sep 2023 Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang

Contrary to this belief, this paper presents a counter-argument: small-magnitude weights of pre-trained models encode vital knowledge essential for tackling difficult downstream tasks, manifested as a monotonic drop in downstream-task performance across the difficulty spectrum as more pre-trained weights are pruned by magnitude.

Quantization
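
The operation referenced above is global magnitude pruning: zero out the smallest-magnitude fraction of pre-trained weights across the model. A minimal PyTorch sketch of that step (the downstream-task evaluation across a difficulty spectrum is the paper's contribution and is not shown):

```python
import torch

def magnitude_prune(state_dict, sparsity=0.5):
    """Zero the globally smallest-magnitude weights across all tensors."""
    all_weights = torch.cat([w.abs().flatten() for w in state_dict.values()])
    k = int(sparsity * all_weights.numel())
    threshold = torch.kthvalue(all_weights, k).values if k > 0 else torch.tensor(0.0)
    return {name: torch.where(w.abs() > threshold, w, torch.zeros_like(w))
            for name, w in state_dict.items()}

weights = {"layer1": torch.randn(4, 4), "layer2": torch.randn(4, 4)}
pruned = magnitude_prune(weights, sparsity=0.5)
print(sum((w == 0).sum().item() for w in pruned.values()))  # about half of the 32 weights zeroed
```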

InstaTune: Instantaneous Neural Architecture Search During Fine-Tuning

no code implementations29 Aug 2023 Sharath Nittur Sridhar, Souvik Kundu, Sairam Sundaresan, Maciej Szankin, Anthony Sarah

However, training super-networks from scratch can be extremely time consuming and compute intensive especially for large models that rely on a two-stage training process of pre-training and fine-tuning.

Neural Architecture Search

FireFly: A Synthetic Dataset for Ember Detection in Wildfire

1 code implementation6 Aug 2023 Yue Hu, Xinan Ye, Yifei Liu, Souvik Kundu, Gourav Datta, Srikar Mutnuri, Namo Asavisanu, Nora Ayanian, Konstantinos Psounis, Peter Beerel

This paper presents "FireFly", a synthetic dataset for ember detection created using Unreal Engine 4 (UE4), designed to overcome the current lack of ember-specific training resources.

Diversity, object-detection +1

Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT

no code implementations14 Jul 2023 Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan

In this paper, we present Sensi-BERT, a sensitivity-driven efficient fine-tuning approach for BERT models that can take an off-the-shelf pre-trained BERT model and yield highly parameter-efficient models for downstream tasks.

QNLI, QQP +4

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations

1 code implementation10 Jun 2023 Yonggan Fu, Ye Yuan, Souvik Kundu, Shang Wu, Shunyao Zhang, Yingyan Celine Lin

Generalizable Neural Radiance Fields (GNeRF) are one of the most promising real-world solutions for novel view synthesis, thanks to their cross-scene generalization capability and thus the possibility of instant rendering on new scenes.

Adversarial Robustness, Novel View Synthesis

Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

no code implementations26 Apr 2023 Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

The large number of ReLU and MAC operations in deep neural networks makes them ill-suited for latency- and compute-efficient private inference.

Model Optimization

Technology-Circuit-Algorithm Tri-Design for Processing-in-Pixel-in-Memory (P2M)

no code implementations6 Apr 2023 Md Abdullah-Al Kaiser, Gourav Datta, Sreetama Sarkar, Souvik Kundu, Zihan Yin, Manas Garg, Ajey P. Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

The massive amounts of data generated by camera sensors motivate data processing inside pixel arrays, i.e., at the extreme edge.

ViTA: A Vision Transformer Inference Accelerator for Edge Applications

no code implementations17 Feb 2023 Shashank Nag, Gourav Datta, Souvik Kundu, Nitin Chandrachoodan, Peter A. Beerel

Vision Transformer models, such as ViT, Swin Transformer, and Transformer-in-Transformer, have recently gained significant traction in computer vision tasks due to their ability to capture the global relation between features which leads to superior performance.

Edge-computing

Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference

no code implementations23 Jan 2023 Souvik Kundu, Shunlin Lu, Yuke Zhang, Jacqueline Liu, Peter A. Beerel

For a similar ReLU budget, SENet can yield models with ~2.32% improved classification accuracy, evaluated on CIFAR-100.

SAL-ViT: Towards Latency Efficient Private Inference on ViT using Selective Attention Search with a Learnable Softmax Approximation

no code implementations ICCV 2023 Yuke Zhang, Dake Chen, Souvik Kundu, Chenghao Li, Peter A. Beerel

Then, given our observation that external attention (EA) presents lower PI latency than widely-adopted self-attention (SA) at the cost of accuracy, we present a selective attention search (SAS) method to integrate the strength of EA and SA.

Vision HGNN: An Image is More than a Graph of Nodes

1 code implementation ICCV 2023 Yan Han, Peihao Wang, Souvik Kundu, Ying Ding, Zhangyang Wang

In this paper, we enhance ViG by transcending conventional "pairwise" linkages and harnessing the power of the hypergraph to encapsulate image information.

graph construction, Graph Neural Network +4

Sparse Mixture Once-for-all Adversarial Training for Efficient In-Situ Trade-Off Between Accuracy and Robustness of DNNs

no code implementations27 Dec 2022 Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel

Existing deep neural networks (DNNs) that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on either activation or weight conditioned convolution operations.

All, image-classification +1

In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

no code implementations21 Dec 2022 Gourav Datta, Zeyu Liu, Md Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel

Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy.

All

Self-Attentive Pooling for Efficient Deep Learning

no code implementations16 Sep 2022 Fang Chen, Gourav Datta, Souvik Kundu, Peter Beerel

With the aggressive down-sampling of the activation maps in the initial layers (providing up to 22x reduction in memory consumption), our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.

Deep Learning

Dynamic Calibration of Nonlinear Sensors with Time-Drifts and Delays by Bayesian Inference

no code implementations29 Aug 2022 Soumyabrata Talukder, Souvik Kundu, Ratnesh Kumar

Most sensor calibrations rely on the linearity and steadiness of their response characteristics, but practical sensors are nonlinear, and their response drifts with time, restricting their choices for adoption.

Bayesian Inference

Federated Learning of Large Models at the Edge via Principal Sub-Model Training

1 code implementation28 Aug 2022 Yue Niu, Saurav Prakash, Souvik Kundu, Sunwoo Lee, Salman Avestimehr

However, the heterogeneous-client setting requires some clients to train the full model, which is not aligned with the resource-constrained setting, while the latter approaches break privacy promises in FL when sharing intermediate representations or labels with the server.

Federated Learning

Lottery Aware Sparsity Hunting: Enabling Federated Learning on Resource-Limited Edge

1 code implementation27 Aug 2022 Sara Babakniya, Souvik Kundu, Saurav Prakash, Yue Niu, Salman Avestimehr

A possible solution to this problem is to utilize off-the-shelf sparse learning algorithms at the clients to meet their resource budget.

Federated Learning, Model Compression +1

Implementation of fast ICA using memristor crossbar arrays for blind image source separations

no code implementations7 Aug 2022 Pavan Kumar Reddy Boppidi, Victor Jeffry Louis, Arvind Subramaniam, Rajesh K. Tripathy, Souri Banerjee, Souvik Kundu

The experimental results demonstrate that the proposed approach is very effective at separating image sources, and the contrast of the images is improved, with an improvement factor (in terms of percentage of structural similarity) of 67.27% compared with a software-based implementation of the conventional ACY ICA and Fast ICA algorithms.

blind source separation

A Fast and Efficient Conditional Learning for Tunable Trade-Off between Accuracy and Robustness

no code implementations28 Mar 2022 Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel

In this paper, we present a fast learnable once-for-all adversarial training (FLOAT) algorithm, which, instead of the existing FiLM-based conditioning, presents a unique weight-conditioned learning that requires no additional layer, thereby incurring no significant increase in parameter count, training time, or network latency compared to standard adversarial training.

image-classification, Image Classification

P2M: A Processing-in-Pixel-in-Memory Paradigm for Resource-Constrained TinyML Applications

no code implementations7 Mar 2022 Gourav Datta, Souvik Kundu, Zihan Yin, Ravi Teja Lakkireddy, Joe Mathai, Ajey Jacob, Peter A. Beerel, Akhilesh R. Jaiswal

Visual data in such cameras are usually captured in the form of analog voltages by a sensor pixel array and then converted to the digital domain for subsequent AI processing using analog-to-digital converters (ADCs).

BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch

no code implementations24 Dec 2021 Souvik Kundu, Shikai Wang, Qirui Sun, Peter A. Beerel, Massoud Pedram

Compared to the baseline FP-32 models, BMPQ can yield models that have 15.4x fewer parameter bits with a negligible drop in accuracy.

Quantization

Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation

no code implementations NeurIPS 2021 Souvik Kundu, Qirui Sun, Yao Fu, Massoud Pedram, Peter Beerel

Knowledge distillation (KD) has recently been identified as a method that can unintentionally leak private information regarding the details of a teacher model to an unauthorized student.

Knowledge Distillation
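
As background, the knowledge-distillation objective this work builds on is the standard Hinton-style combination of a cross-entropy term and a temperature-softened KL term between teacher and student logits; the hyperparameters below are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style KD: alpha * KL(teacher || student at temperature T) + (1 - alpha) * CE."""
    soft_targets = F.log_softmax(teacher_logits / T, dim=-1)
    soft_student = F.log_softmax(student_logits / T, dim=-1)
    kd = F.kl_div(soft_student, soft_targets, reduction="batchmean", log_target=True) * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels).item())
```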

Pipeline Parallelism for Inference on Heterogeneous Edge Computing

no code implementations28 Oct 2021 Yang Hu, Connor Imes, Xuanang Zhao, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul N. Walters

We propose EdgePipe, a distributed framework for edge systems that uses pipeline parallelism to both speed up inference and enable running larger (and more accurate) models that otherwise cannot fit on single edge devices.

Edge-computing

Understanding of Emotion Perception from Art

no code implementations13 Oct 2021 Digbalay Bose, Krishna Somandepalli, Souvik Kundu, Rimita Lahiri, Jonathan Gratch, Shrikanth Narayanan

Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals.

HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise

1 code implementation ICCV 2021 Souvik Kundu, Massoud Pedram, Peter A. Beerel

Low-latency deep spiking neural networks (SNNs) have become a promising alternative to conventional artificial neural networks (ANNs) because of their potential for increased energy efficiency on event-driven neuromorphic hardware.
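
The spiking neurons underlying these SNN papers are typically leaky integrate-and-fire (LIF) units. The discrete-time sketch below shows the basic membrane update, threshold spike, and hard reset; the leak factor, threshold, and input encoding are illustrative assumptions, not the HIRE-SNN training setup.

```python
import numpy as np

def lif_forward(input_currents, leak=0.9, v_threshold=1.0, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron over T input timesteps."""
    v, spikes = 0.0, []
    for current in input_currents:
        v = leak * v + current          # leaky integration of the input current
        if v >= v_threshold:            # emit a spike when the membrane potential crosses threshold
            spikes.append(1)
            v = v_reset                 # hard reset after spiking
        else:
            spikes.append(0)
    return np.array(spikes)

currents = np.array([0.3, 0.4, 0.5, 0.1, 0.9, 0.2])
print(lif_forward(currents))  # [0 0 1 0 0 1]
```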

FLOAT: Fast Learnable Once-for-All Adversarial Training for Tunable Trade-off Between Accuracy and Robustness

no code implementations29 Sep 2021 Souvik Kundu, Peter Anthony Beerel, Sairam Sundaresan

In this paper, we present Fast Learnable Once-for-all Adversarial Training (FLOAT) which transforms the weight tensors without using extra layers, thereby incurring no significant increase in parameter count, training time, or network latency compared to a standard adversarial training.

All, image-classification +1

Training Energy-Efficient Deep Spiking Neural Networks with Single-Spike Hybrid Input Encoding

no code implementations26 Jul 2021 Gourav Datta, Souvik Kundu, Peter A. Beerel

This paper presents a training framework for low-latency energy-efficient SNNs that uses a hybrid encoding scheme at the input layer in which the analog pixel values of an image are directly applied during the first timestep and a novel variant of spike temporal coding is used during subsequent timesteps.

Computational Efficiency, image-classification +1

HYPER-SNN: Towards Energy-efficient Quantized Deep Spiking Neural Networks for Hyperspectral Image Classification

no code implementations26 Jul 2021 Gourav Datta, Souvik Kundu, Akhilesh R. Jaiswal, Peter A. Beerel

However, the accurate processing of the spectral and spatial correlation between the bands requires the use of energy-expensive 3-D Convolutional Neural Networks (CNNs).

Computational Efficiency, Hyperspectral Image Classification +2

Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression

no code implementations16 Jul 2021 Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet on both CIFAR-10 and CIFAR-100, and with VGG16 on Tiny-ImageNet. The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drop in accuracy compared to baseline unpruned counterparts.

Sparse Learning

AttentionLite: Towards Efficient Self-Attention Models for Vision

no code implementations21 Dec 2020 Souvik Kundu, Sairam Sundaresan

We propose a novel framework for producing a class of parameter- and compute-efficient models, called AttentionLite, suitable for resource-constrained applications.

Knowledge Distillation

Attention-based Image Upsampling

no code implementations17 Dec 2020 Souvik Kundu, Hesham Mostafa, Sharath Nittur Sridhar, Sairam Sundaresan

Convolutional layers are an integral part of many deep neural network solutions in computer vision.

image-classification, Image Classification +3

A Co-Attentive Cross-Lingual Neural Model for Dialogue Breakdown Detection

1 code implementation COLING 2020 Qian Lin, Souvik Kundu, Hwee Tou Ng

One of the major challenges is that a dialogue system may generate an undesired utterance leading to a dialogue breakdown, which degrades the overall interaction quality.

Language Modeling, Language Modelling +1

A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

1 code implementation3 Nov 2020 Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram

This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images.

image-classification, Image Classification +1

Learning to Identify Follow-Up Questions in Conversational Question Answering

no code implementations ACL 2020 Souvik Kundu, Qian Lin, Hwee Tou Ng

Despite recent progress in conversational question answering, most prior work does not focus on follow-up questions.

Conversational Question Answering

Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks

1 code implementation29 Jan 2020 Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2.

A Pre-defined Sparse Kernel Based Convolution for Deep CNNs

no code implementations2 Oct 2019 Souvik Kundu, Saurav Prakash, Haleh Akrami, Peter A. Beerel, Keith M. Chugg

To explore the potential of this approach, we have experimented with two widely accepted datasets, CIFAR-10 and Tiny ImageNet, in sparse variants of both the ResNet18 and VGG16 architectures.

Exploiting Explicit Paths for Multi-hop Reading Comprehension

1 code implementation ACL 2019 Souvik Kundu, Tushar Khot, Ashish Sabharwal, Peter Clark

To capture additional context, PathNet also composes the passage representations along each path to compute a passage-based representation.

Implicit Relations, Knowledge Graphs +1

A Nil-Aware Answer Extraction Framework for Question Answering

1 code implementation EMNLP 2018 Souvik Kundu, Hwee Tou Ng

However, current approaches suffer from an impractical assumption that every question has a valid answer in the associated passage.

Question Answering, Reading Comprehension +1

A Highly Parallel FPGA Implementation of Sparse Neural Network Training

1 code implementation31 May 2018 Sourya Dey, Diandian Chen, Zongyang Li, Souvik Kundu, Kuan-Wen Huang, Keith M. Chugg, Peter A. Beerel

We demonstrate an FPGA implementation of a parallel and reconfigurable architecture for sparse neural networks, capable of on-chip training and inference.

A Question-Focused Multi-Factor Attention Network for Question Answering

1 code implementation25 Jan 2018 Souvik Kundu, Hwee Tou Ng

Neural network models recently proposed for question answering (QA) primarily focus on capturing the passage-question relation.

Open-Domain Question Answering, Reading Comprehension +2
