Search Results for author: Hadi Pouransari

Found 24 papers, 12 papers with code

FastVLM: Efficient Vision Encoding for Vision Language Models

1 code implementation • 17 Dec 2024 • Pavan Kumar Anasosalu Vasu, Fartash Faghri, Chun-Liang Li, Cem Koc, Nate True, Albert Antony, Gokul Santhanam, James Gabriel, Peter Grasch, Oncel Tuzel, Hadi Pouransari

At different operational resolutions, the vision encoder of a VLM can be optimized along two axes: reducing encoding latency and minimizing the number of visual tokens passed to the LLM, thereby lowering overall latency.
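
A rough back-of-the-envelope illustration of the two axes mentioned above: for a patch-based (ViT-style) encoder, the number of visual tokens grows quadratically with input resolution, which in turn drives both encoding latency and the LLM's prefill cost. The patch size and pooling factor below are illustrative assumptions, not FastVLM's actual configuration.

```python
# Illustrative only: token count vs. input resolution for a hypothetical
# patch-based vision encoder (patch size and pooling factor are assumptions,
# not FastVLM's actual configuration).
def visual_token_count(resolution: int, patch: int = 14, pool: int = 1) -> int:
    """Number of visual tokens a ViT-style encoder would hand to the LLM."""
    tokens_per_side = resolution // patch
    return (tokens_per_side * tokens_per_side) // (pool * pool)

for res in (224, 336, 672, 1024):
    print(res, visual_token_count(res))  # 256, 576, 2304, 5329 tokens
```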

Promoting cross-modal representations to improve multimodal foundation models for physiological signals

no code implementations • 21 Oct 2024 • Ching Fang, Christopher Sandino, Behrooz Mahasseni, Juri Minxha, Hadi Pouransari, Erdrin Azemi, Ali Moin, Ellen Zippi

However, methods for developing foundation models in healthcare are still in the early stages of exploration, and it is unclear which pretraining strategies are most effective given the diversity of physiological signals.

Contrastive Learning

MUSCLE: A Model Update Strategy for Compatible LLM Evolution

no code implementations • 12 Jul 2024 • Jessica Echterhoff, Fartash Faghri, Raviteja Vemulapalli, Ting-yao Hu, Chun-Liang Li, Oncel Tuzel, Hadi Pouransari

We propose a training strategy to minimize the extent of instance regression in model updates, which involves training a compatibility adapter that can enhance task fine-tuned language models.

regression
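
One common way to quantify per-instance regression between model versions is the negative-flip rate: the fraction of examples the old model answered correctly that the updated model gets wrong. The sketch below is a generic illustration of that metric, not the paper's exact evaluation protocol.

```python
import numpy as np

def negative_flip_rate(old_preds, new_preds, labels):
    """Fraction of examples the old model got right but the updated model gets wrong.
    A simple proxy for per-instance regression between model versions."""
    old_preds, new_preds, labels = map(np.asarray, (old_preds, new_preds, labels))
    old_correct = old_preds == labels
    new_wrong = new_preds != labels
    return float(np.mean(old_correct & new_wrong))

# Example: one of five instances regresses even though overall accuracy is unchanged.
print(negative_flip_rate([0, 1, 2, 1, 0], [0, 1, 2, 0, 1], [0, 1, 2, 1, 1]))  # 0.2
```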

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum

1 code implementation • 21 May 2024 • Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Oncel Tuzel

During training, we use variable sequence length and batch size, sampling simultaneously from all buckets with a curriculum.

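A minimal sketch of the bucketed, variable-sequence-length idea described above: sequences are decomposed into fixed power-of-two length buckets, and each optimization step draws a whole batch from one bucket, with the batch size chosen so the number of tokens per step stays constant. The specific bucket lengths, token budget, and curriculum weights here are placeholders, not the paper's settings.

```python
import random
from collections import defaultdict

# Hypothetical sketch: split token sequences into power-of-two length buckets,
# then draw each optimization step from one bucket with a batch size that keeps
# tokens-per-step constant. Bucket sizes and weights are illustrative only.
BUCKETS = (256, 512, 1024, 2048, 4096, 8192)
TOKENS_PER_STEP = 262_144  # e.g. 2**18 tokens per optimization step

def decompose(doc_tokens, buckets=BUCKETS):
    """Greedily chop a document into chunks whose lengths are bucket sizes."""
    out, i = defaultdict(list), 0
    while len(doc_tokens) - i >= buckets[0]:
        size = max(b for b in buckets if b <= len(doc_tokens) - i)
        out[size].append(doc_tokens[i:i + size])
        i += size
    return out  # any leftover shorter than the smallest bucket is dropped here

def sample_step(bucket_store, weights):
    """Pick a bucket (curriculum-weighted), then a full batch from it."""
    size = random.choices(BUCKETS, weights=weights, k=1)[0]
    batch_size = TOKENS_PER_STEP // size          # constant tokens per step
    return size, random.choices(bucket_store[size], k=batch_size)
```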

CLIP with Quality Captions: A Strong Pretraining for Vision Tasks

no code implementations • 14 May 2024 • Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Oncel Tuzel

In this work, we find that simply improving the quality of captions in image-text datasets improves the quality of CLIP's visual representations, resulting in significant improvement on downstream dense prediction vision tasks.

Depth Estimation, Object Detection +4

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models

1 code implementation • 30 Nov 2023 • Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel

Motivated by this, we ask the following important question: "How can we leverage the knowledge from a large VFM to train a small task-specific model for a new target task with limited labeled training data?"

Image Retrieval, Retrieval +1
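
One generic answer to that question is feature distillation: train the small model (plus a projection head) to match the frozen foundation model's features on target-domain images. The sketch below illustrates only that generic setup; the modules, dimensions, loss, and data are placeholders, and it is not necessarily the recipe proposed in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Generic feature-distillation sketch (not necessarily this paper's recipe):
# a small student is trained to match a frozen vision foundation model's
# features on unlabeled target-domain images. Dimensions are placeholders.
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 768)).eval()  # stands in for a frozen VFM
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256))
proj = nn.Linear(256, 768)  # maps student features to the teacher's width

opt = torch.optim.AdamW(list(student.parameters()) + list(proj.parameters()), lr=1e-3)
images = torch.randn(16, 3, 32, 32)  # unlabeled target-domain batch (dummy data)

with torch.no_grad():
    t_feat = F.normalize(teacher(images), dim=-1)
s_feat = F.normalize(proj(student(images)), dim=-1)
loss = (1 - (s_feat * t_feat).sum(dim=-1)).mean()  # cosine-distance matching
loss.backward()
opt.step()
```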

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

3 code implementations • CVPR 2024 • Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel

We further demonstrate the effectiveness of our multi-modal reinforced training by training a CLIP model based on ViT-B/16 image backbone and achieving +2.9% average performance improvement on 38 evaluation benchmarks compared to the previous best.

Image Captioning, Transfer Learning +1
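
A rough sketch of what reinforced training could look like in code: the standard CLIP contrastive loss is combined with a distillation term against image/text embeddings that were precomputed by a teacher and stored alongside the dataset. The temperature, loss weighting, and exact distillation target below are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn.functional as F

# Rough sketch of combining a CLIP-style contrastive loss with distillation from
# teacher embeddings stored offline with the dataset. The exact losses,
# temperatures, and weighting used in the paper may differ; treat as illustrative.
def reinforced_clip_loss(img, txt, t_img, t_txt, tau=0.07, lam=0.5):
    img, txt = F.normalize(img, dim=-1), F.normalize(txt, dim=-1)
    t_img, t_txt = F.normalize(t_img, dim=-1), F.normalize(t_txt, dim=-1)

    logits = img @ txt.t() / tau
    targets = torch.arange(img.size(0))
    contrastive = 0.5 * (F.cross_entropy(logits, targets) +
                         F.cross_entropy(logits.t(), targets))

    # Distill the teacher's image-text similarity structure.
    t_logits = t_img @ t_txt.t() / tau
    distill = F.kl_div(F.log_softmax(logits, dim=-1),
                       F.softmax(t_logits, dim=-1), reduction="batchmean")
    return (1 - lam) * contrastive + lam * distill
```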

TiC-CLIP: Continual Training of CLIP Models

1 code implementation • 24 Oct 2023 • Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri

We introduce the first set of web-scale Time-Continual (TiC) benchmarks for training vision-language models: TiC-DataComp, TiC-YFCC, and TiC-Redcaps.

Continual Learning, Retrieval

Frequency-Aware Masked Autoencoders for Multimodal Pretraining on Biosignals

1 code implementation • 12 Sep 2023 • Ran Liu, Ellen L. Zippi, Hadi Pouransari, Chris Sandino, Jingping Nie, Hanlin Goh, Erdrin Azemi, Ali Moin

To achieve effective pretraining in the presence of potential distributional shifts, we propose a frequency-aware masked autoencoder ($\texttt{bio}$FAME) that learns to parameterize the representation of biosignals in the frequency space.
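
A minimal illustration of operating on a biosignal in the frequency space referred to above: the signal is transformed with an FFT and a fraction of frequency bins is masked. bioFAME's actual architecture, masking scheme, and reconstruction objective are more involved; this only shows the masking step and a trivial zero-fill baseline.

```python
import numpy as np

# Minimal illustration of masking a biosignal in the frequency domain, the space
# in which a frequency-aware masked autoencoder would operate. The actual bioFAME
# design (filters, masking ratio, reconstruction target) is more involved.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)          # one channel of a biosignal window

spectrum = np.fft.rfft(signal)              # complex frequency components
mask = rng.random(spectrum.shape) < 0.5     # hide 50% of the frequency bins
visible = np.where(mask, 0.0 + 0.0j, spectrum)

# An autoencoder would be trained to predict the masked bins from the visible ones;
# here we just report the reconstruction error of the trivial "zero-fill" baseline.
recon = np.fft.irfft(visible, n=signal.size)
print(np.mean((recon - signal) ** 2))
```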

FastFill: Efficient Compatible Model Update

1 code implementation • 8 Mar 2023 • Florian Jaeckle, Fartash Faghri, Ali Farhadi, Oncel Tuzel, Hadi Pouransari

The task of retrieving the items from a gallery set that are most similar to a given query is performed through a similarity comparison on features.

Representation Learning +1
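
The retrieval setup described in the abstract can be written down in a few lines: gallery items and the query are embedded into a common feature space and ranked by cosine similarity. The feature dimensions and data below are placeholders.

```python
import numpy as np

# Sketch of the retrieval setup described above: gallery items and a query are
# embedded into a common feature space and compared by cosine similarity.
def retrieve(query_feat, gallery_feats, k=5):
    """Return indices of the k gallery features most similar to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims)[:k]

gallery = np.random.randn(1000, 128)   # features from the (old or new) model
query = np.random.randn(128)
print(retrieve(query, gallery))
```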

APE: Aligning Pretrained Encoders to Quickly Learn Aligned Multimodal Representations

no code implementations • 8 Oct 2022 • Elan Rosenfeld, Preetum Nakkiran, Hadi Pouransari, Oncel Tuzel, Fartash Faghri

Recent advances in learning aligned multimodal representations have been primarily driven by training large neural networks on massive, noisy paired-modality datasets.

Zero-Shot Learning

Forward Compatible Training for Large-Scale Embedding Retrieval Systems

1 code implementation • CVPR 2022 • Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, Hadi Pouransari

To avoid the cost of backfilling, backward-compatible training (BCT) modifies training of the new model to make its representations compatible with those of the old model.

Representation Learning, Retrieval
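
For context, a hedged sketch of a compatibility term in the spirit of BCT: while the new embedding model trains on its own objective, its features are additionally pushed to remain usable by the old system's frozen classifier, so the gallery does not need to be re-embedded (backfilled). The shapes, heads, and loss weighting below are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch of a backward-compatibility term in the spirit of BCT: the new
# embedding model's features are also trained to work with the old system's
# frozen classifier, so old gallery features never need to be recomputed.
new_encoder = nn.Linear(512, 256)                 # stand-in for the new embedding model
new_classifier = nn.Linear(256, 100)              # new model's own head
old_classifier = nn.Linear(256, 100).eval()       # frozen head from the old system
for p in old_classifier.parameters():
    p.requires_grad_(False)

x = torch.randn(32, 512)                          # dummy batch of inputs
y = torch.randint(0, 100, (32,))                  # dummy class labels

feats = new_encoder(x)
task_loss = F.cross_entropy(new_classifier(feats), y)    # new model's objective
compat_loss = F.cross_entropy(old_classifier(feats), y)  # keep features usable by the old head
(task_loss + compat_loss).backward()
```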

Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution

no code implementations • 30 Jun 2020 • Hadi Pouransari, Mojan Javaheripi, Vinay Sharma, Oncel Tuzel

We propose extracurricular learning, a novel knowledge distillation method, that bridges this gap by (1) modeling student and teacher output distributions; (2) sampling examples from an approximation to the underlying data distribution; and (3) matching student and teacher output distributions over this extended set including uncertain samples.

Image Classification, Knowledge Distillation +2
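
A heavily hedged sketch of the three ingredients listed in the abstract: teacher and student output distributions are matched not only on training examples but also on samples drawn from an approximation of the underlying data distribution. Mixup between training examples is used below purely as a stand-in for that sampling step; the paper's actual distribution modeling and sampling scheme may differ substantially.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch: distillation over an *extended* sample set. Mixup between
# training images is only a crude stand-in for sampling from an approximation
# of the underlying data distribution.
teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()
student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

def kd_loss(x, tau=4.0):
    with torch.no_grad():
        t = F.softmax(teacher(x) / tau, dim=-1)
    s = F.log_softmax(student(x) / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau * tau

x = torch.randn(32, 3, 32, 32)                    # real training batch (dummy)
lam = torch.distributions.Beta(0.4, 0.4).sample()
x_virtual = lam * x + (1 - lam) * x[torch.randperm(x.size(0))]  # synthesized samples

loss = kd_loss(x) + kd_loss(x_virtual)            # match distributions on both sets
loss.backward()
```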

Least squares binary quantization of neural networks

1 code implementation • 9 Jan 2020 • Hadi Pouransari, Zhucheng Tu, Oncel Tuzel

We conduct experiments on the ImageNet dataset and show a reduced accuracy gap when using the proposed least squares quantization algorithms.

Quantization
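
For the 1-bit case, the least-squares-optimal binary quantizer has a well-known closed form: minimizing ||w - a*b||^2 over a scalar scale a and signs b in {-1, +1}^n gives b = sign(w) and a = mean(|w|). The short example below verifies numerically that this scale reduces reconstruction error relative to unscaled binarization; the paper itself generalizes beyond this 1-bit case.

```python
import numpy as np

# Worked example of the classic 1-bit least-squares result:
# b = sign(w), a = mean(|w|) minimizes ||w - a * b||^2.
rng = np.random.default_rng(0)
w = rng.standard_normal(1 << 16)

b = np.sign(w)
a = np.abs(w).mean()                        # least-squares optimal scale
naive = np.sign(w)                          # unscaled binarization

print(np.mean((w - a * b) ** 2))            # lower reconstruction error (~0.36)
print(np.mean((w - naive) ** 2))            # higher reconstruction error (~0.40)
```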

Optimal Binary Quantization for Deep Neural Networks

no code implementations • 25 Sep 2019 • Hadi Pouransari, Oncel Tuzel

We conduct experiments on the ImageNet dataset and show a reduced accuracy gap when using the proposed optimal quantization algorithms.

Quantization

Democratizing Production-Scale Distributed Deep Learning

no code implementations • 31 Oct 2018 • Minghuang Ma, Hadi Pouransari, Daniel Chao, Saurabh Adya, Santiago Akle Serrano, Yi Qin, Dan Gimnicher, Dominic Walsh

The interest and demand for training deep neural networks have been experiencing rapid growth, spanning a wide range of applications in both academia and industry.

Deep Learning
