Search Results for author: Fartash Faghri

Found 24 papers, 14 papers with code

Weight subcloning: direct initialization of transformers using larger pretrained ones

no code implementations14 Dec 2023 Mohammad Samragh, Mehrdad Farajtabar, Sachin Mehta, Raviteja Vemulapalli, Fartash Faghri, Devang Naik, Oncel Tuzel, Mohammad Rastegari

The usual practice of transfer learning overcomes this challenge by initializing the model with the weights of a pretrained model of the same size and specification to improve convergence and training speed.

Image Classification Transfer Learning
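
The snippet above describes standard same-size transfer; weight subcloning instead initializes a smaller transformer directly from a larger pretrained one. A minimal sketch of the general idea (keeping a subset of layers and slicing weight matrices down to the smaller width) follows; the function and slicing scheme are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def subclone_linear(big_weight: torch.Tensor, out_dim: int, in_dim: int) -> torch.Tensor:
    """Illustrative: take the leading rows/columns of a larger weight matrix."""
    return big_weight[:out_dim, :in_dim].clone()

# Hypothetical example: shrink 1024x1024 projections to 512x512 and keep
# only the first half of the pretrained layers for the smaller model.
big_layers = [{"proj.weight": torch.randn(1024, 1024)} for _ in range(24)]
small_layers = [
    {"proj.weight": subclone_linear(layer["proj.weight"], 512, 512)}
    for layer in big_layers[:12]
]
```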

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models

no code implementations30 Nov 2023 Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel

Motivated by this, we ask the following important question: "How can we leverage the knowledge from a large VFM to train a small task-specific model for a new target task with limited labeled training data?"

Image Retrieval Retrieval +1
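
One common way to transfer knowledge from a large vision foundation model (VFM) to a small task-specific model is feature distillation. The sketch below shows a generic distillation loss for illustration only; it is not the paper's specific recipe, and it assumes the student's features have already been projected to the teacher's feature dimension.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_feats, teacher_feats, student_logits, labels, alpha=0.5):
    """Generic sketch: match the frozen teacher's features and fit the (limited) labels."""
    feat_loss = F.mse_loss(student_feats, teacher_feats.detach())
    task_loss = F.cross_entropy(student_logits, labels)
    return alpha * feat_loss + (1 - alpha) * task_loss
```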

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

1 code implementation28 Nov 2023 Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel

We further demonstrate the effectiveness of our multi-modal reinforced training by training a CLIP model based on a ViT-B/16 image backbone and achieving a +2.9% average performance improvement on 38 evaluation benchmarks compared to the previous best.

Image Captioning Transfer Learning +1

TiC-CLIP: Continual Training of CLIP Models

1 code implementation24 Oct 2023 Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri

We introduce the first set of web-scale Time-Continual (TiC) benchmarks for training vision-language models: TiC-DataComp, TiC-YFCC, and TiC-Redcaps.

Continual Learning Retrieval

FastFill: Efficient Compatible Model Update

1 code implementation8 Mar 2023 Florian Jaeckle, Fartash Faghri, Ali Farhadi, Oncel Tuzel, Hadi Pouransari

The task of retrieving the data most similar to a given query from a gallery set is performed through a similarity comparison on features.

Representation Learning Retrieval
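
The gallery retrieval step described above is, in its simplest form, a nearest-neighbor search over feature vectors. A minimal sketch using cosine similarity is below; it illustrates the retrieval setting only, not FastFill's compatible-update machinery.

```python
import numpy as np

def retrieve(query_feat: np.ndarray, gallery_feats: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k gallery items most similar to the query (cosine similarity)."""
    q = query_feat / np.linalg.norm(query_feat)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = g @ q
    return np.argsort(-sims)[:k]

# Hypothetical usage with random features
gallery = np.random.randn(1000, 128)
query = np.random.randn(128)
top5 = retrieve(query, gallery, k=5)
```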

RangeAugment: Efficient Online Augmentation with Range Learning

1 code implementation20 Dec 2022 Sachin Mehta, Saeid Naderiparizi, Fartash Faghri, Maxwell Horton, Lailin Chen, Ali Farhadi, Oncel Tuzel, Mohammad Rastegari

To answer the open question of the importance of magnitude ranges for each augmentation operation, we introduce RangeAugment, which allows us to efficiently learn the range of magnitudes for individual as well as composite augmentation operations.

Knowledge Distillation object-detection +3
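
A minimal sketch of the core idea is below: keep a learnable magnitude range [low, high] per augmentation and sample magnitudes from it with a differentiable reparameterization. The module is an illustrative assumption; the paper learns the ranges with an auxiliary image-similarity objective rather than this bare form.

```python
import torch
import torch.nn as nn

class LearnableRange(nn.Module):
    """Illustrative: a learnable magnitude range for one augmentation operation."""
    def __init__(self, low: float = 0.1, high: float = 0.9):
        super().__init__()
        self.low = nn.Parameter(torch.tensor(low))
        self.high = nn.Parameter(torch.tensor(high))

    def sample(self) -> torch.Tensor:
        # Reparameterized uniform sample so gradients flow into low/high.
        u = torch.rand(())
        return self.low + u * (self.high - self.low)

brightness_range = LearnableRange()
magnitude = brightness_range.sample()  # use this magnitude in the augmentation op
```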

APE: Aligning Pretrained Encoders to Quickly Learn Aligned Multimodal Representations

no code implementations8 Oct 2022 Elan Rosenfeld, Preetum Nakkiran, Hadi Pouransari, Oncel Tuzel, Fartash Faghri

Recent advances in learning aligned multimodal representations have been primarily driven by training large neural networks on massive, noisy paired-modality datasets.

Zero-Shot Learning

MixTailor: Mixed Gradient Aggregation for Robust Learning Against Tailored Attacks

no code implementations16 Jul 2022 Ali Ramezani-Kebrya, Iman Tabrizian, Fartash Faghri, Petar Popovski

We introduce MixTailor, a scheme based on randomization of the aggregation strategies that makes it impossible for the attacker to be fully informed.
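
The randomization idea above can be sketched as picking one of several gradient aggregation rules at random each step, so an attacker cannot tailor its gradients to a single known rule. The pool of rules below (mean, median, trimmed mean) is an illustrative assumption, not MixTailor's exact set.

```python
import numpy as np

def trimmed_mean(grads: np.ndarray, trim: int = 1) -> np.ndarray:
    """Drop the `trim` largest and smallest values per coordinate, then average."""
    s = np.sort(grads, axis=0)
    return s[trim:len(grads) - trim].mean(axis=0)

AGGREGATORS = [
    lambda g: g.mean(axis=0),        # plain averaging
    lambda g: np.median(g, axis=0),  # coordinate-wise median
    trimmed_mean,                    # coordinate-wise trimmed mean
]

def mixed_aggregate(worker_grads: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly pick an aggregation rule each step (sketch of the randomization idea)."""
    rule = AGGREGATORS[rng.integers(len(AGGREGATORS))]
    return rule(worker_grads)

grads = np.random.randn(8, 10)  # 8 workers, 10 parameters
update = mixed_aggregate(grads, np.random.default_rng(0))
```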

Training Efficiency and Robustness in Deep Learning

1 code implementation2 Dec 2021 Fartash Faghri

We show that a redundancy-aware modification to the sampling of training data improves training speed, and we develop an efficient method, gradient clustering, for detecting the diversity of the training signal.

Adversarial Robustness

NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

no code implementations28 Apr 2021 Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training.

Quantization
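
The communication saving comes from quantizing each gradient before it is sent. Below is a minimal sketch of nonuniform (exponentially spaced) quantization levels with unbiased stochastic rounding; it illustrates the general scheme and is not NUQSGD's exact codebook or encoding.

```python
import numpy as np

def nonuniform_quantize(grad: np.ndarray, num_levels: int = 4, rng=None) -> np.ndarray:
    """Quantize |g_i| / ||g|| onto exponentially spaced levels with stochastic rounding."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(grad)
    if norm == 0:
        return grad
    # Levels 0, ..., 1/4, 1/2, 1 (nonuniform: denser near zero).
    levels = np.concatenate(([0.0], 0.5 ** np.arange(num_levels - 1, -1, -1.0)))
    r = np.abs(grad) / norm
    idx = np.clip(np.searchsorted(levels, r, side="right") - 1, 0, len(levels) - 2)
    lo, hi = levels[idx], levels[idx + 1]
    p = (r - lo) / (hi - lo)                       # round up with probability p (unbiased)
    q = np.where(rng.random(grad.shape) < p, hi, lo)
    return norm * np.sign(grad) * q

g = np.random.randn(1000)
q = nonuniform_quantize(g)
```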

Bridging the Gap Between Adversarial Robustness and Optimization Bias

1 code implementation17 Feb 2021 Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux

We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.

Adversarial Robustness

A Study of Gradient Variance in Deep Learning

1 code implementation9 Jul 2020 Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba

We introduce a method, Gradient Clustering, to minimize the variance of average mini-batch gradient with stratified sampling.

Clustering
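
A minimal sketch of the stratified sampling step is below: draw an equal number of examples from each cluster so the averaged mini-batch gradient has lower variance. The cluster assignments are stubbed here with random labels; in the paper they come from clustering per-example gradients.

```python
import numpy as np

def stratified_batch(cluster_ids: np.ndarray, batch_size: int, rng) -> np.ndarray:
    """Sample an equal number of example indices from each cluster (sketch)."""
    clusters = np.unique(cluster_ids)
    per_cluster = max(1, batch_size // len(clusters))
    batch = [
        rng.choice(np.where(cluster_ids == c)[0], size=per_cluster, replace=True)
        for c in clusters
    ]
    return np.concatenate(batch)[:batch_size]

# Hypothetical cluster assignments standing in for gradient clusters.
rng = np.random.default_rng(0)
cluster_ids = rng.integers(0, 4, size=10_000)
batch_idx = stratified_batch(cluster_ids, batch_size=64, rng=rng)
```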

SOAR: Second-Order Adversarial Regularization

no code implementations4 Apr 2020 Avery Ma, Fartash Faghri, Nicolas Papernot, Amir-Massoud Farahmand

Adversarial training is a common approach to improving the robustness of deep neural networks against adversarial examples.

Adversarial Robustness
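
Since the snippet refers to adversarial training, here is a minimal PGD-style adversarial training loss in PyTorch. It illustrates the common baseline the abstract mentions, not SOAR's second-order regularizer, and the step sizes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def adversarial_training_loss(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft a PGD adversarial example, then return the training loss on it."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        # Keep the perturbed image in the valid [0, 1] range.
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    return F.cross_entropy(model(x + delta.detach()), y)
```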

Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

no code implementations25 Sep 2019 Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed on clusters to perform model fitting in parallel.

Quantization

NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization

1 code implementation16 Aug 2019 Ali Ramezani-Kebrya, Fartash Faghri, Daniel M. Roy

As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed on clusters to perform model fitting in parallel.

Quantization

Adversarial Spheres

2 code implementations ICLR 2018 Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, Ian Goodfellow

We hypothesize that this counterintuitive behavior is a naturally occurring result of the high-dimensional geometry of the data manifold.

Adversarial Manipulation of Deep Representations

2 code implementations16 Nov 2015 Sara Sabour, Yanshuai Cao, Fartash Faghri, David J. Fleet

We show that the representation of an image in a deep neural network (DNN) can be manipulated to mimic that of another natural image, with only minor, imperceptible perturbations to the original image.
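
The manipulation described above can be sketched as optimizing a small perturbation so that the network's internal representation of the perturbed image matches that of a chosen guide image. The feature-extractor handle, optimizer, and bounds below are illustrative assumptions, not the paper's exact constrained formulation.

```python
import torch

def match_representation(feature_fn, source, guide, eps=10 / 255, steps=100, lr=0.01):
    """Perturb `source` (within an L_inf ball) so its features mimic those of `guide`."""
    target = feature_fn(guide).detach()
    delta = torch.zeros_like(source, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feats = feature_fn((source + delta).clamp(0, 1))
        loss = (feats - target).pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)   # keep the perturbation small
    return (source + delta.detach()).clamp(0, 1)
```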
