Search Results for author: Shreyas Saxena

Found 10 papers, 2 papers with code

MediSwift: Efficient Sparse Pre-trained Biomedical Language Models

no code implementations · 1 Mar 2024 · Vithursan Thangarasa, Mahmoud Salem, Shreyas Saxena, Kevin Leong, Joel Hestness, Sean Lie

Large language models (LLMs) are typically trained on general source data for various domains, but a recent surge in domain-specific LLMs has shown their potential to outperform general-purpose models in domain-specific tasks (e.g., biomedicine).

Question Answering

Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency

2 code implementations · 21 Mar 2023 · Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie

Recent research has focused on weight sparsity in neural network training to reduce FLOPs, aiming for improved efficiency (test accuracy w.r.t. training FLOPs).

SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models

no code implementations · 18 Mar 2023 · Vithursan Thangarasa, Abhay Gupta, William Marshall, Tianda Li, Kevin Leong, Dennis DeCoste, Sean Lie, Shreyas Saxena

In this work, we show the benefits of using unstructured weight sparsity to train only a subset of weights during pre-training (Sparse Pre-training) and then recover the representational capacity by allowing the zeroed weights to learn (Dense Fine-tuning).
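The two-phase recipe described above can be sketched with a single weight matrix: a fixed unstructured binary mask zeroes most weights during pre-training, and fine-tuning simply drops the mask so the zeroed weights can learn again. This is a minimal illustration, not the paper's implementation; the shapes, sparsity level, and dummy all-ones "gradient" are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
mask = rng.random(W.shape) < 0.25   # keep ~25% of the weights (illustrative)

def sparse_update(W, grad, mask, lr=0.1):
    # Sparse pre-training: only masked-in weights receive gradient updates.
    return W - lr * (grad * mask)

def dense_update(W, grad, lr=0.1):
    # Dense fine-tuning: no mask, so previously-zeroed weights move off zero.
    return W - lr * grad

grad = np.ones_like(W)                           # stand-in for a real backprop gradient
W_sparse = sparse_update(W * mask, grad, mask)   # pruned entries stay exactly zero
W_dense = dense_update(W_sparse, grad)           # pruned entries recover capacity
```

In a real training loop the mask would be applied to every layer and every optimizer step of pre-training, while fine-tuning would run the unmasked update on the downstream task.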

Text Generation Text Summarization

Instance-Level Task Parameters: A Robust Multi-task Weighting Framework

no code implementations · 11 Jun 2021 · Pavan Kumar Anasosalu Vasu, Shreyas Saxena, Oncel Tuzel

When applied to datasets where one or more tasks can have noisy annotations, the proposed method learns to prioritize learning from clean labels for a given task, e.g., reducing surface estimation errors by up to 60%.
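The idea can be sketched as a weight per (instance, task) pair: a noisy annotation for one task gets down-weighted without discarding the same instance's clean labels for the other tasks. The paper learns these weights jointly with the model; in this hypothetical sketch the losses and weights are set by hand purely for illustration.

```python
import numpy as np

# Per-instance, per-task losses (rows: instances, columns: tasks).
per_task_losses = np.array([
    [0.2, 3.0],   # instance 0: task-1 annotation looks noisy (large loss)
    [0.3, 0.4],
    [0.1, 0.2],
    [2.5, 0.3],   # instance 3: task-0 annotation looks noisy
])

# Instance-level task parameters (hand-picked here; learned in the paper):
# low weight on the suspect (instance, task) pairs.
instance_task_weights = np.array([
    [1.0, 0.1],
    [1.0, 1.0],
    [1.0, 1.0],
    [0.1, 1.0],
])

weighted_loss = (instance_task_weights * per_task_losses).mean()
unweighted_loss = per_task_losses.mean()
```

Down-weighting the noisy pairs keeps their large losses from dominating the multi-task objective.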

Depth Estimation Multi-Task Learning +2

Training With Data Dependent Dynamic Learning Rates

no code implementations · 27 May 2021 · Shreyas Saxena, Nidhi Vyas, Dennis DeCoste

This setting is widely adopted under the assumption that loss functions for each instance are similar in nature, and hence, a common learning rate can be used.

Image Classification

Learning Soft Labels via Meta Learning

no code implementations · 20 Sep 2020 · Nidhi Vyas, Shreyas Saxena, Thomas Voice

One-hot labels do not represent soft decision boundaries among concepts, and hence, models trained on them are prone to overfitting.
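The contrast between one-hot and soft labels can be shown with a single example: a soft label keeps some probability mass on related classes instead of putting all of it on one class, softening the target the model is pushed toward. In the paper the soft labels are learned via meta-learning; the values below are hand-picked assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

one_hot = np.array([0.0, 1.0, 0.0])   # hard target: all mass on class 1
soft = np.array([0.1, 0.8, 0.1])      # soft target (learned in the paper)

logits = np.array([1.0, 2.0, 0.5])    # illustrative model outputs
p = softmax(logits)

ce_hard = -np.sum(one_hot * np.log(p))  # standard cross-entropy
ce_soft = -np.sum(soft * np.log(p))     # cross-entropy against the soft label
```

Training against the soft target penalizes over-confident predictions on the dominant class, which is the overfitting behavior the snippet above describes.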

Image Classification Meta-Learning

Data Parameters: A New Family of Parameters for Learning a Differentiable Curriculum

1 code implementation · NeurIPS 2019 · Shreyas Saxena, Oncel Tuzel, Dennis DeCoste

To the best of our knowledge, our work is the first curriculum learning method to show gains on large scale image classification and detection tasks.
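A minimal sketch of the idea, under the assumption that a data parameter acts as a learnable per-sample temperature on the logits: a high temperature flattens that sample's softmax and shrinks its influence on training, so optimizing the temperatures yields an implicit, differentiable curriculum. The logits and temperatures below are illustrative, not from the paper.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 0.5, 0.1])   # illustrative model outputs for one sample
target = 0

# Two settings of this sample's data parameter (temperature).
losses = {}
for temperature in (1.0, 5.0):
    p = softmax(logits / temperature)         # high T flattens the distribution
    losses[temperature] = -np.log(p[target])  # cross-entropy for this sample
```

Because the temperatures are ordinary parameters, gradient descent can raise them on hard or mislabeled samples and lower them on easy ones, which is what makes the curriculum differentiable.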

General Classification Image Classification +2

Learning Unsupervised Visual Grounding Through Semantic Self-Supervision

no code implementations · 17 Mar 2018 · Syed Ashar Javed, Shreyas Saxena, Vineet Gandhi

Localizing natural language phrases in images is a challenging problem that requires joint understanding of both the textual and visual modalities.

Visual Grounding

Convolutional Neural Fabrics

no code implementations · NeurIPS 2016 · Shreyas Saxena, Jakob Verbeek

Despite the success of CNNs, selecting the optimal architecture for a given task remains an open problem.

Image Classification Semantic Segmentation
