Search Results for author: Pavan Kumar Anasosalu Vasu

Found 6 papers, 4 papers with code

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

1 code implementation • 28 Nov 2023 • Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel

We further demonstrate the effectiveness of our multi-modal reinforced training by training a CLIP model based on a ViT-B/16 image backbone, achieving a +2.9% average performance improvement on 38 evaluation benchmarks compared to the previous best.

Image Captioning, Transfer Learning, +1

FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization

4 code implementations • ICCV 2023 • Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan

To this end, we introduce a novel token mixing operator, RepMixer, a building block of FastViT, that uses structural reparameterization to lower the memory access cost by removing skip-connections in the network.

Image Classification
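
As a rough illustration of the structural reparameterization idea behind RepMixer, the sketch below trains a depthwise conv + BatchNorm branch next to a skip connection, then folds both into a single skip-free depthwise convolution for inference. This is a minimal sketch under stated assumptions, not the FastViT code; the class name RepMixerSketch and the reparameterize() method are made up for illustration.

```python
import torch
import torch.nn as nn

class RepMixerSketch(nn.Module):
    """Hypothetical sketch of a reparameterizable token mixer (not the authors' code)."""

    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        self.dim, self.k = dim, kernel_size
        # Train-time branch: depthwise conv + BatchNorm, used alongside a skip connection.
        self.conv = nn.Conv2d(dim, dim, kernel_size, padding=kernel_size // 2,
                              groups=dim, bias=False)
        self.bn = nn.BatchNorm2d(dim)
        self.fused = None  # populated by reparameterize()

    def forward(self, x):
        if self.fused is not None:        # inference: a single conv, no skip connection
            return self.fused(x)
        return x + self.bn(self.conv(x))  # training: skip connection kept

    @torch.no_grad()
    def reparameterize(self):
        # Fold BatchNorm (running statistics, so valid in eval mode) into the conv.
        scale = self.bn.weight / (self.bn.running_var + self.bn.eps).sqrt()
        w = self.conv.weight * scale.reshape(-1, 1, 1, 1)
        b = self.bn.bias - self.bn.running_mean * scale
        # Absorb the identity skip as a 1 at the center of each depthwise kernel.
        w[:, 0, self.k // 2, self.k // 2] += 1.0
        fused = nn.Conv2d(self.dim, self.dim, self.k, padding=self.k // 2,
                          groups=self.dim, bias=True).to(w.device)
        fused.weight.copy_(w)
        fused.bias.copy_(b)
        self.fused = fused
```

After `m.eval()` followed by `m.reparameterize()`, the fused path matches the train-time branch on the same input up to numerical tolerance; removing the skip connection in this way is what lowers the memory access cost at inference.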

MobileOne: An Improved One millisecond Mobile Backbone

7 code implementations • CVPR 2023 • Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu, Oncel Tuzel, Anurag Ranjan

Furthermore, we show that our model generalizes to multiple tasks (image classification, object detection, and semantic segmentation) with significant improvements in latency and accuracy compared to existing efficient architectures when deployed on a mobile device.

Efficient Neural Network, Image Classification, +2

Forward Compatible Training for Large-Scale Embedding Retrieval Systems

1 code implementation • CVPR 2022 • Vivek Ramanujan, Pavan Kumar Anasosalu Vasu, Ali Farhadi, Oncel Tuzel, Hadi Pouransari

To avoid the cost of backfilling, BCT modifies the training of the new model so that its representations are compatible with those of the old model.

Representation Learning, Retrieval
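
For context on the BCT baseline the snippet mentions, here is a minimal sketch of backward-compatible training: the new backbone is trained against its own classifier and, simultaneously, against the frozen classifier of the old model, so its embeddings stay comparable with old ones. The function `bct_loss` and names such as `new_backbone`, `old_head`, and `lam` are placeholders, not the paper's API.

```python
import torch.nn.functional as F

def bct_loss(new_backbone, new_head, old_head, images, labels, lam=1.0):
    # old_head is the old model's classifier, frozen beforehand
    # (requires_grad=False); gradients still flow through it into z.
    z = new_backbone(images)                            # new-model embeddings
    loss_new = F.cross_entropy(new_head(z), labels)     # usual training loss
    loss_compat = F.cross_entropy(old_head(z), labels)  # compatibility term
    return loss_new + lam * loss_compat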

Instance-Level Task Parameters: A Robust Multi-task Weighting Framework

no code implementations • 11 Jun 2021 • Pavan Kumar Anasosalu Vasu, Shreyas Saxena, Oncel Tuzel

When applied to datasets where one or more tasks can have noisy annotations, the proposed method learns to prioritize learning from clean labels for a given task, e.g., reducing surface estimation errors by up to 60%.

Depth Estimation, Multi-Task Learning, +2
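
One plausible way to realize per-instance, per-task weighting as described above is an uncertainty-style scaling, sketched below: every (instance, task) pair gets a learned parameter s, and its loss is weighted by exp(-s) with an additive s regularizer. The class `InstanceTaskWeights` and this exact parameterization are assumptions for illustration, not the paper's implementation.

```python
import torch

class InstanceTaskWeights(torch.nn.Module):
    """Learned weight s[i, t] for every (instance i, task t) pair (illustrative)."""

    def __init__(self, num_instances: int, num_tasks: int):
        super().__init__()
        self.s = torch.nn.Parameter(torch.zeros(num_instances, num_tasks))

    def forward(self, per_task_losses, idx):
        # per_task_losses: (batch, num_tasks) unreduced per-task losses
        # idx: (batch,) dataset indices identifying each instance
        s = self.s[idx]
        # exp(-s) down-weights an instance's noisy loss; the +s term
        # keeps the weights from collapsing to zero.
        return (torch.exp(-s) * per_task_losses + s).sum(dim=1).mean()
```

Under this formulation, instances whose annotations are noisy for a task drive their s upward, so the model effectively learns from the clean labels, consistent with the behavior the snippet describes.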
