no code implementations • 19 Dec 2023 • Sharath Nittur Sridhar, Maciej Szankin, Fang Chen, Sairam Sundaresan, Anthony Sarah
In this paper, we demonstrate that multi-objective search algorithms paired with lightly trained predictors can efficiently search for both the sub-network architecture and the corresponding quantization policy, outperforming the respective baselines across performance objectives such as accuracy, model size, and latency.
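The search described above can be pictured as a predictor-guided multi-objective loop: candidate (architecture, quantization) pairs are scored by cheap proxy predictors and filtered to a Pareto front. A minimal sketch follows; the candidate knobs (`depth`, `bits`), the toy predictor formulas, and the objective names are illustrative assumptions, not the paper's actual implementation.

```python
import random

def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (accuracy maximized; latency and size minimized)."""
    no_worse = (a["acc"] >= b["acc"] and a["lat"] <= b["lat"]
                and a["size"] <= b["size"])
    better = (a["acc"] > b["acc"] or a["lat"] < b["lat"]
              or a["size"] < b["size"])
    return no_worse and better

def pareto_front(scored):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in scored
            if not any(dominates(o, c) for o in scored if o is not c)]

def search(n_samples=200, seed=0):
    rng = random.Random(seed)
    scored = []
    for _ in range(n_samples):
        depth = rng.randint(1, 12)      # hypothetical architecture knob
        bits = rng.choice([2, 4, 8])    # hypothetical quantization policy
        # Toy stand-ins for the "lightly trained predictors":
        scored.append({
            "depth": depth, "bits": bits,
            "acc": depth * 0.05 + bits * 0.01,
            "lat": depth * 1.0 + bits * 0.2,
            "size": depth * bits,
        })
    return pareto_front(scored)
```

The point of the predictors is that each candidate is scored without any training run, so the quadratic Pareto filter over a few hundred samples is the only cost.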
no code implementations • 29 Aug 2023 • Sharath Nittur Sridhar, Souvik Kundu, Sairam Sundaresan, Maciej Szankin, Anthony Sarah
However, training super-networks from scratch can be extremely time-consuming and compute-intensive, especially for large models that rely on a two-stage training process of pre-training and fine-tuning.
no code implementations • 14 Jul 2023 • Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan
In this paper, we present Sensi-BERT, a sensitivity-driven approach to efficient fine-tuning of BERT models that takes an off-the-shelf pre-trained BERT model and yields highly parameter-efficient models for downstream tasks.
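One way to read "sensitivity-driven" is ranking parameter groups by a sensitivity score and fine-tuning only the top-ranked ones while freezing the rest. The selection sketch below, including the module names and budget, is a hypothetical illustration, not Sensi-BERT's actual procedure.

```python
def select_trainable(sensitivity, budget):
    """Given per-parameter-group sensitivity scores, mark the `budget`
    most sensitive groups trainable and freeze everything else."""
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    keep = set(ranked[:budget])
    return {name: name in keep for name in sensitivity}

# Hypothetical sensitivity scores for a few BERT modules.
scores = {"layer0.attn": 0.9, "layer0.ffn": 0.2,
          "layer1.attn": 0.7, "layer1.ffn": 0.1}
mask = select_trainable(scores, budget=2)
```

Only the groups mapped to `True` would receive gradient updates, which is what makes the resulting fine-tuned model parameter-efficient.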
no code implementations • 27 Dec 2022 • Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel
Existing deep neural networks (DNNs) that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on either activation-conditioned or weight-conditioned convolution operations.
no code implementations • 19 May 2022 • Daniel Cummings, Anthony Sarah, Sharath Nittur Sridhar, Maciej Szankin, Juan Pablo Munoz, Sairam Sundaresan
Recent advances in Neural Architecture Search (NAS) such as one-shot NAS offer the ability to extract specialized hardware-aware sub-network configurations from a task-specific super-network.
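Sub-network extraction from a one-shot super-network can be pictured as weight slicing: each sub-network configuration picks a width per layer and inherits the corresponding slice of the trained super-network weights. The elastic-width scheme below is a simplified illustration of this kind of weight sharing, not the paper's exact mechanism.

```python
def extract_subnet(super_layers, widths):
    """Inherit weights for a sub-network by taking the first `w` units
    of each super-network layer (elastic-width weight sharing)."""
    if len(widths) > len(super_layers):
        raise ValueError("more widths than layers")
    return [layer[:w] for layer, w in zip(super_layers, widths)]

# Super-network with three layers of maximum width 4.
super_layers = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
subnet = extract_subnet(super_layers, widths=[2, 4, 1])
```

Because the sub-network reuses the shared weights directly, evaluating a hardware-specific configuration requires no retraining, which is what makes the search-after-training decoupling cheap.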
no code implementations • 28 Mar 2022 • Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel
In this paper, we present a fast learnable once-for-all adversarial training (FLOAT) algorithm, which replaces the existing FiLM-based conditioning with a unique weight-conditioned learning that requires no additional layers, thereby incurring no significant increase in parameter count, training time, or network latency compared to standard adversarial training.
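The weight-conditioned idea can be sketched as reparameterizing the weight tensor itself with a learnable per-weight scaling toggled by a clean/adversarial flag, instead of inserting FiLM layers into the network. The specific form `w * (1 + lam * alpha)` below is an assumption for illustration, not FLOAT's published formula.

```python
def conditioned_weights(w, alpha, lam):
    """Conditionally transform the weight tensor without extra layers:
    lam = 0 recovers the clean-mode weights; lam = 1 applies the
    learnable per-weight scaling alpha (illustrative form only)."""
    return [wi * (1.0 + lam * ai) for wi, ai in zip(w, alpha)]

w = [1.0, -2.0, 0.5]        # shared base weights
alpha = [0.1, 0.1, 0.1]     # hypothetical learned scaling
clean = conditioned_weights(w, alpha, lam=0.0)
robust = conditioned_weights(w, alpha, lam=1.0)
```

Since the conditioning is folded into the weights, the clean and robust modes share one network and switching between them adds no layers and essentially no latency.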
no code implementations • 25 Feb 2022 • Anthony Sarah, Daniel Cummings, Sharath Nittur Sridhar, Sairam Sundaresan, Maciej Szankin, Tristan Webb, J. Pablo Munoz
These methods decouple the super-network training from the sub-network search and thus decrease the computational burden of specializing to different hardware platforms.
no code implementations • 24 Feb 2022 • Sharath Nittur Sridhar, Anthony Sarah, Sairam Sundaresan
Models based on BERT have been extremely successful in solving a variety of natural language processing (NLP) tasks.
no code implementations • 29 Sep 2021 • Souvik Kundu, Peter Anthony Beerel, Sairam Sundaresan
In this paper, we present Fast Learnable Once-for-all Adversarial Training (FLOAT), which transforms the weight tensors without using extra layers, thereby incurring no significant increase in parameter count, training time, or network latency compared to standard adversarial training.
no code implementations • 21 Dec 2020 • Souvik Kundu, Sairam Sundaresan
We propose a novel framework for producing a class of parameter- and compute-efficient models called AttentionLite, suitable for resource-constrained applications.
no code implementations • 17 Dec 2020 • Souvik Kundu, Hesham Mostafa, Sharath Nittur Sridhar, Sairam Sundaresan
Convolutional layers are an integral part of many deep neural network solutions in computer vision.
no code implementations • 2 Dec 2020 • J. Emmanuel Johnson, Sairam Sundaresan, Tansu Daylan, Lisseth Gavilan, Daniel K. Giles, Stela Ishitani Silva, Anna Jungbluth, Brett Morris, Andrés Muñoz-Jaramillo
We harness the power of deep learning and successfully apply Convolutional Neural Networks to regress stellar rotation periods from Kepler light curves.
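A minimal picture of the regression setup: a 1-D convolution over the light-curve flux values, a nonlinearity, global pooling, and a linear head that outputs a period. The kernel, pooling choice, and head weights below are toy stand-ins for illustration, not the paper's trained network.

```python
def conv1d(x, kernel):
    """Valid-mode 1-D convolution (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def predict_period(light_curve, kernel, w, b):
    """Conv -> ReLU -> global average pool -> linear regression head."""
    feats = [max(0.0, v) for v in conv1d(light_curve, kernel)]
    pooled = sum(feats) / len(feats)
    return w * pooled + b

flux = [1.0, 0.8, 1.0, 0.8, 1.0, 0.8]   # toy light curve
period = predict_period(flux, kernel=[0.5, 0.5], w=10.0, b=1.0)
```

In the real setting the kernel and head weights are learned from many labeled Kepler light curves; the sketch only shows the shape of the forward pass.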
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Zhiyu Chen, Wenhu Chen, Hanwen Zha, Xiyou Zhou, Yunkai Zhang, Sairam Sundaresan, William Yang Wang
If provided only with the table, existing models struggle to produce controllable and high-fidelity logical generations.
no code implementations • 19 Apr 2019 • Subarna Tripathi, Sharath Nittur Sridhar, Sairam Sundaresan, Hanlin Tang
Structured representations such as scene graphs serve as an efficient and compact representation that can be used for downstream rendering or retrieval tasks.