Search Results for author: Shreyas Saxena

Found 10 papers, 2 papers with code

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

no code implementations · 1 Mar 2024 · Vithursan Thangarasa, Mahmoud Salem, Shreyas Saxena, Kevin Leong, Joel Hestness, Sean Lie

Large language models (LLMs) are typically trained on general source data for various domains, but a recent surge in domain-specific LLMs has shown their potential to outperform general-purpose models in domain-specific tasks (e.g., biomedicine).

Question Answering

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

2 code implementations · 21 Mar 2023 · Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie

Recent research has focused on weight sparsity in neural network training to reduce FLOPs, aiming for improved efficiency (test accuracy w.r.t. training FLOPs).

SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models

no code implementations · 18 Mar 2023 · Vithursan Thangarasa, Abhay Gupta, William Marshall, Tianda Li, Kevin Leong, Dennis DeCoste, Sean Lie, Shreyas Saxena

In this work, we show the benefits of using unstructured weight sparsity to train only a subset of weights during pre-training (Sparse Pre-training) and then recover the representational capacity by allowing the zeroed weights to learn (Dense Fine-tuning).
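The two-phase recipe described above can be sketched with a single weight matrix: a fixed unstructured binary mask zeroes most weights during pre-training, and fine-tuning simply drops the mask so the zeroed weights can learn again. This is a minimal illustration, not the paper's implementation; the shapes, sparsity level, and dummy all-ones "gradient" are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
mask = rng.random(W.shape) < 0.25   # keep ~25% of the weights (illustrative)

def sparse_update(W, grad, mask, lr=0.1):
    # Sparse pre-training: only masked-in weights receive gradient updates.
    return W - lr * (grad * mask)

def dense_update(W, grad, lr=0.1):
    # Dense fine-tuning: no mask, so previously-zeroed weights move off zero.
    return W - lr * grad

grad = np.ones_like(W)                           # stand-in for a real backprop gradient
W_sparse = sparse_update(W * mask, grad, mask)   # pruned entries stay exactly zero
W_dense = dense_update(W_sparse, grad)           # pruned entries recover capacity
```

In a real training loop the mask would be applied to every layer and every optimizer step of pre-training, while fine-tuning would run the unmasked update on the downstream task.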

Text Generation Text Summarization

Instance-Level Task Parameters: A Robust Multi-task Weighting Framework

no code implementations · 11 Jun 2021 · Pavan Kumar Anasosalu Vasu, Shreyas Saxena, Oncel Tuzel

When applied to datasets where one or more tasks can have noisy annotations, the proposed method learns to prioritize learning from clean labels for a given task, e.g., reducing surface estimation errors by up to 60%.
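The idea can be sketched as a weight per (instance, task) pair: a noisy annotation for one task gets down-weighted without discarding the same instance's clean labels for the other tasks. The paper learns these weights jointly with the model; in this hypothetical sketch the losses and weights are set by hand purely for illustration.

```python
import numpy as np

# Per-instance, per-task losses (rows: instances, columns: tasks).
per_task_losses = np.array([
    [0.2, 3.0],   # instance 0: task-1 annotation looks noisy (large loss)
    [0.3, 0.4],
    [0.1, 0.2],
    [2.5, 0.3],   # instance 3: task-0 annotation looks noisy
])

# Instance-level task parameters (hand-picked here; learned in the paper):
# low weight on the suspect (instance, task) pairs.
instance_task_weights = np.array([
    [1.0, 0.1],
    [1.0, 1.0],
    [1.0, 1.0],
    [0.1, 1.0],
])

weighted_loss = (instance_task_weights * per_task_losses).mean()
unweighted_loss = per_task_losses.mean()
```

Down-weighting the noisy pairs keeps their large losses from dominating the multi-task objective.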

Depth Estimation Multi-Task Learning +2

Training With Data Dependent Dynamic Learning Rates

no code implementations · 27 May 2021 · Shreyas Saxena, Nidhi Vyas, Dennis DeCoste

This setting is widely adopted under the assumption that loss functions for each instance are similar in nature, and hence, a common learning rate can be used.

Image Classification

Learning Soft Labels via Meta Learning

no code implementations · 20 Sep 2020 · Nidhi Vyas, Shreyas Saxena, Thomas Voice

One-hot labels do not represent soft decision boundaries among concepts, and hence, models trained on them are prone to overfitting.
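The contrast between one-hot and soft labels can be shown with a single example: a soft label keeps some probability mass on related classes instead of putting all of it on one class, softening the target the model is pushed toward. In the paper the soft labels are learned via meta-learning; the values below are hand-picked assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

one_hot = np.array([0.0, 1.0, 0.0])   # hard target: all mass on class 1
soft = np.array([0.1, 0.8, 0.1])      # soft target (learned in the paper)

logits = np.array([1.0, 2.0, 0.5])    # illustrative model outputs
p = softmax(logits)

ce_hard = -np.sum(one_hot * np.log(p))  # standard cross-entropy
ce_soft = -np.sum(soft * np.log(p))     # cross-entropy against the soft label
```

Training against the soft target penalizes over-confident predictions on the dominant class, which is the overfitting behavior the snippet above describes.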

Image Classification Meta-Learning

Data Parameters: A New Family of Parameters for Learning a Differentiable Curriculum

1 code implementation · NeurIPS 2019 · Shreyas Saxena, Oncel Tuzel, Dennis DeCoste

To the best of our knowledge, our work is the first curriculum learning method to show gains on large scale image classification and detection tasks.
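A minimal sketch of the idea, under the assumption that a data parameter acts as a learnable per-sample temperature on the logits: a high temperature flattens that sample's softmax and shrinks its influence on training, so optimizing the temperatures yields an implicit, differentiable curriculum. The logits and temperatures below are illustrative, not from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 0.5, 0.1])   # illustrative model outputs for one sample
target = 0

# Two settings of this sample's data parameter (temperature).
losses = {}
for temperature in (1.0, 5.0):
    p = softmax(logits / temperature)         # high T flattens the distribution
    losses[temperature] = -np.log(p[target])  # cross-entropy for this sample
```

Because the temperatures are ordinary parameters, gradient descent can raise them on hard or mislabeled samples and lower them on easy ones, which is what makes the curriculum differentiable.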

General Classification Image Classification +2

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

no code implementations · 17 Mar 2018 · Syed Ashar Javed, Shreyas Saxena, Vineet Gandhi

Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.

Visual Grounding

Convolutional Neural Fabrics

no code implementations · NeurIPS 2016 · Shreyas Saxena, Jakob Verbeek

Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem.

Image Classification Semantic Segmentation
