Search Results for author: Mohammad Rastegari

Found 55 papers, 25 papers with code

Adding Unlabeled Samples to Categories by Learned Attributes

no code implementations • CVPR 2013 • Jonghyun Choi, Mohammad Rastegari, Ali Farhadi, Larry S. Davis

We propose a method to expand the visual coverage of training sets that consist of a small number of labeled examples using learned attributes.

Paper
Add Code

Multi-attribute Queries: To Merge or Not to Merge?

no code implementations • CVPR 2013 • Mohammad Rastegari, Ali Diba, Devi Parikh, Ali Farhadi

We exploit a discriminative binary space to compute these geometric quantities efficiently.

Attribute Image Retrieval

Paper
Add Code

Comparing apples to apples in the evaluation of binary coding methods

no code implementations • 5 May 2014 • Mohammad Rastegari, Shobeir Fakhraei, Jonghyun Choi, David Jacobs, Larry S. Davis

We discuss methodological issues related to the evaluation of unsupervised binary code construction methods for nearest neighbor search.

Paper
Add Code

Class Consistent Multi-Modal Fusion With Binary Features

no code implementations • CVPR 2015 • Ashish Shrivastava, Mohammad Rastegari, Sumit Shekhar, Rama Chellappa, Larry S. Davis

Many existing recognition algorithms combine different modalities based on training accuracy but do not consider the possibility of noise at test time.

Paper
Add Code

Computationally Bounded Retrieval

no code implementations • CVPR 2015 • Mohammad Rastegari, Cem Keskin, Pushmeet Kohli, Shahram Izadi

We demonstrate this technique on large retrieval databases, specifically ImageNET, GIST1M and SUN-attribute for the task of nearest neighbor retrieval, and show that our method achieves a speed-up of up to a factor of 100 over state-of-the-art methods, while having on-par and in some cases even better accuracy.

Attribute Image Retrieval +1

Paper
Add Code

Discriminative and Consistent Similarities in Instance-Level Multiple Instance Learning

no code implementations • CVPR 2015 • Mohammad Rastegari, Hannaneh Hajishirzi, Ali Farhadi

In this paper we present a bottom-up method to instance-level Multiple Instance Learning (MIL) that learns to discover positive instances with globally constrained reasoning about local pairwise similarities.

Multiple Instance Learning Text Categorization

Paper
Add Code

On Large-Scale Retrieval: Binary or n-ary Coding?

no code implementations • 20 Sep 2015 • Mahyar Najibi, Mohammad Rastegari, Larry S. Davis

To make large-scale search feasible, Distance Estimation and Subset Indexing are the main approaches.

Image Retrieval Quantization +1

Paper
Add Code

Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images

no code implementations • 12 Nov 2015 • Roozbeh Mottaghi, Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi

Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging.

Object

Paper
Add Code

Action Recognition with Image Based CNN Features

no code implementations • 13 Dec 2015 • Mahdyar Ravanbakhsh, Hossein Mousavi, Mohammad Rastegari, Vittorio Murino, Larry S. Davis

Action recognition tasks usually relies on complex handcrafted structures as features to represent the human action model.

Action Recognition Temporal Action Localization

Paper
Add Code

G-CNN: an Iterative Grid Based Object Detector

no code implementations • CVPR 2016 • Mahyar Najibi, Mohammad Rastegari, Larry S. Davis

G-CNN starts with a multi-scale grid of fixed bounding boxes.

2k Object +2

Paper
Add Code

XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

18 code implementations • 16 Mar 2016 • Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi

We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks.

Ranked #10 on Classification with Binary Neural Network on ImageNet

Binarization Classification with Binary Neural Network +1

349

Paper
Code

"What happens if..." Learning to Predict the Effect of Forces in Images

no code implementations • 17 Mar 2016 • Roozbeh Mottaghi, Mohammad Rastegari, Abhinav Gupta, Ali Farhadi

To build a dataset of forces in scenes, we reconstructed all images in SUN RGB-D dataset in a physics simulator to estimate the physical movements of objects caused by external forces applied to them.

Paper
Add Code

Newtonian Scene Understanding: Unfolding the Dynamics of Objects in Static Images

no code implementations • CVPR 2016 • Roozbeh Mottaghi, Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi

Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging.

Object Scene Understanding

Paper
Add Code

CNN-aware Binary Map for General Semantic Segmentation

no code implementations • 29 Sep 2016 • Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi, Mohammad Rastegari, Carlo Regazzoni

To the best of our knowledge our method is the first attempt on general semantic image segmentation using CNN.

Clustering Image Segmentation +2

Paper
Add Code

LCNN: Lookup-based Convolutional Neural Network

no code implementations • CVPR 2017 • Hessam Bagherinezhad, Mohammad Rastegari, Ali Farhadi

We introduce LCNN, a lookup-based convolutional neural network that encodes convolutions by few lookups to a dictionary that is trained to cover the space of weights in CNNs.

Few-Shot Learning

Paper
Add Code

IQA: Visual Question Answering in Interactive Environments

1 code implementation • CVPR 2018 • Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari, Joseph Redmon, Dieter Fox, Ali Farhadi

Our experiments show that our proposed model outperforms popular single controller based methods on IQUAD V1.

Navigate Visual Question Answering

122

Paper
Code

ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

8 code implementations • ECCV 2018 • Sachin Mehta, Mohammad Rastegari, Anat Caspi, Linda Shapiro, Hannaneh Hajishirzi

We introduce a fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints.

Ranked #48 on Semantic Segmentation on PASCAL VOC 2012 test

Panoptic Segmentation Real-Time Semantic Segmentation +1

8,218

Paper
Code

Label Refinery: Improving ImageNet Classification through Label Progression

4 code implementations • 7 May 2018 • Hessam Bagherinezhad, Maxwell Horton, Mohammad Rastegari, Ali Farhadi

Among the three main components (data, labels, and models) of any supervised learning system, data and models have been the main subjects of active research.

Classification General Classification

280

Paper
Code

Pyramidal Recurrent Unit for Language Modeling

2 code implementations • EMNLP 2018 • Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi

We introduce the Pyramidal Recurrent Unit (PRU), which enables learning representations in high dimensional space with more generalization power and fewer parameters.

Language Modelling

Paper
Code

ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network

9 code implementations • CVPR 2019 • Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi

Compared to YOLOv2 on the MS-COCO object detection, ESPNetv2 delivers 4. 4% higher accuracy with 6x fewer FLOPs.

Ranked #41 on Semantic Segmentation on PASCAL VOC 2012 test

General Classification Image Classification +5

8,218

Paper
Code

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

2 code implementations • CVPR 2019 • Mitchell Wortsman, Kiana Ehsani, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi

In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation.

Ranked #2 on Visual Navigation on AI2-THOR

Meta-Learning Meta Reinforcement Learning +1

185

Paper
Code

ELASTIC: Improving CNNs with Dynamic Scaling Policies

1 code implementation • CVPR 2019 • Huiyu Wang, Aniruddha Kembhavi, Ali Farhadi, Alan Yuille, Mohammad Rastegari

We formulate the scaling policy as a non-linear function inside the network's structure that (a) is learned from data, (b) is instance specific, (c) does not add extra computation, and (d) can be applied on any network architecture.

General Classification Multi-Label Classification +1

Paper
Code

Two Body Problem: Collaborative Visual Task Completion

no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.

Task 2 Vocal Bursts Valence Prediction

Paper
Add Code

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

1 code implementation • CVPR 2019 • Kenneth Marino, Mohammad Rastegari, Ali Farhadi, Roozbeh Mottaghi

In this paper, we address the task of knowledge-based visual question answering and provide a benchmark, called OK-VQA, where the image content is not sufficient to answer the questions, encouraging methods that rely on external knowledge resources.

object-detection Object Detection +3

Paper
Code

Discovering Neural Wirings

4 code implementations • NeurIPS 2019 • Mitchell Wortsman, Ali Farhadi, Mohammad Rastegari

In this work we propose a method for discovering neural wirings.

Ranked #8 on Network Pruning on ImageNet - ResNet 50 - 90% sparsity

Feature Engineering Network Pruning +1

140

Paper
Code

Butterfly Transform: An Efficient FFT Based Neural Architecture Design

1 code implementation • CVPR 2020 • Keivan Alizadeh Vahid, Anish Prabhu, Ali Farhadi, Mohammad Rastegari

By replacing pointwise convolutions with BFT, we reduce the computational complexity of these layers from O(n^2) to O(n\log n) with respect to the number of channels.

Neural Architecture Search

Paper
Code

DiCENet: Dimension-wise Convolutions for Efficient Networks

2 code implementations • 8 Jun 2019 • Sachin Mehta, Hannaneh Hajishirzi, Mohammad Rastegari

When DiCE units are stacked to build the DiCENet model, we observe significant improvements over state-of-the-art models across various computer vision tasks including image classification, object detection, and semantic segmentation.

Ranked #21 on Semantic Segmentation on PASCAL VOC 2012 val

Image Classification Neural Architecture Search +3

2,917

Paper
Code

Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors

no code implementations • CVPR 2019 • Mohammad Mahdi Derakhshani, Saeed Masoudnia, Amir Hossein Shaker, Omid Mersa, Mohammad Amin Sadeghi, Mohammad Rastegari, Babak N. Araabi

We present a simple and effective learning technique that significantly improves mAP of YOLO object detectors without compromising their speed.

Object

Paper
Add Code

DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling

1 code implementation • ICLR 2020 • Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi

For sequence models with large vocabularies, a majority of network parameters lie in the input and output layers.

Machine Translation Translation +1

460

Paper
Code

What's Hidden in a Randomly Weighted Neural Network?

3 code implementations • CVPR 2020 • Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari

Training a neural network is synonymous with learning the values of the weights.

Ranked #912 on Image Classification on ImageNet

Image Classification

178

Paper
Code

Supermasks in Superposition

2 code implementations • NeurIPS 2020 • Mitchell Wortsman, Vivek Ramanujan, Rosanne Liu, Aniruddha Kembhavi, Mohammad Rastegari, Jason Yosinski, Ali Farhadi

We present the Supermasks in Superposition (SupSup) model, capable of sequentially learning thousands of tasks without catastrophic forgetting.

114

Paper
Code

Layer-Wise Data-Free CNN Compression

no code implementations • 18 Nov 2020 • Maxwell Horton, Yanzi Jin, Ali Farhadi, Mohammad Rastegari

We also show how to precondition the network to improve the accuracy of our layer-wise compression method.

Quantization

Paper
Add Code

Learning Neural Network Subspaces

1 code implementation • 20 Feb 2021 • Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari

Recent observations have advanced our understanding of the neural network optimization landscape, revealing the existence of (1) paths of high accuracy containing diverse solutions and (2) wider minima offering improved performance.

126

Paper
Code

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression

no code implementations • ICLR 2022 • Minsik Cho, Keivan A. Vahid, Saurabh Adya, Mohammad Rastegari

For MobileNet-v1, which is a challenging DNN to compress, DKM delivers 63. 9% top-1 ImageNet1k accuracy with 0. 72 MB model size (22. 4x model compression factor).

Clustering Neural Network Compression

Paper
Add Code

MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer

25 code implementations • ICLR 2022 • Sachin Mehta, Mohammad Rastegari

Light-weight convolutional neural networks (CNNs) are the de-facto for mobile vision tasks.

Ranked #765 on Image Classification on ImageNet

Image Classification object-detection +1

29,648

Paper
Code

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time

1 code implementation • 8 Oct 2021 • Elvis Nunez, Maxwell Horton, Anish Prabhu, Anurag Ranjan, Ali Farhadi, Mohammad Rastegari

Our models require no retraining, thus our subspace of models can be deployed entirely on-device to allow adaptive network compression at inference time.

Quantization

Paper
Code

Token Pooling in Vision Transformers

no code implementations • 8 Oct 2021 • Dmitrii Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish Prabhu, Mohammad Rastegari, Oncel Tuzel

Token Pooling is a simple and effective operator that can benefit many architectures.

Paper
Add Code

CVNets: High Performance Library for Computer Vision

3 code implementations • 4 Jun 2022 • Sachin Mehta, Farzad Abdolhosseini, Mohammad Rastegari

We introduce CVNets, a high-performance open-source library for training deep neural networks for visual recognition tasks, including classification, detection, and segmentation.

Video Understanding Vocal Bursts Intensity Prediction

1,665

Paper
Code

Separable Self-attention for Mobile Vision Transformers

5 code implementations • 6 Jun 2022 • Sachin Mehta, Mohammad Rastegari

The improved model, MobileViTv2, is state-of-the-art on several mobile vision tasks, including ImageNet object classification and MS-COCO object detection.

Object Detection

29,648

Paper
Code

SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks

1 code implementation • 21 Jul 2022 • Chien-Yu Lin, Anish Prabhu, Thomas Merth, Sachin Mehta, Anurag Ranjan, Maxwell Horton, Mohammad Rastegari

In this paper, we perform an empirical evaluation on methods for sharing parameters in isotropic networks (SPIN).

Neural Network Compression

Paper
Code

RangeAugment: Efficient Online Augmentation with Range Learning

1 code implementation • 20 Dec 2022 • Sachin Mehta, Saeid Naderiparizi, Fartash Faghri, Maxwell Horton, Lailin Chen, Ali Farhadi, Oncel Tuzel, Mohammad Rastegari

To answer the open question on the importance of magnitude ranges for each augmentation operation, we introduce RangeAugment that allows us to efficiently learn the range of magnitudes for individual as well as composite augmentation operations.

Knowledge Distillation object-detection +3

1,665

Paper
Code

Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement

1 code implementation • ICCV 2023 • Fartash Faghri, Hadi Pouransari, Sachin Mehta, Mehrdad Farajtabar, Ali Farhadi, Mohammad Rastegari, Oncel Tuzel

Models pretrained on ImageNet+ and fine-tuned on CIFAR-100+, Flowers-102+, and Food-101+, reach up to 3. 4% improved accuracy.

Data Augmentation Knowledge Distillation +2

Paper
Code

Bytes Are All You Need: Transformers Operating Directly On File Bytes

1 code implementation • 31 May 2023 • Maxwell Horton, Sachin Mehta, Ali Farhadi, Mohammad Rastegari

Our model, \emph{ByteFormer}, achieves an ImageNet Top-1 classification accuracy of $77. 33\%$ when training and testing directly on TIFF file bytes using a transformer backbone with configuration similar to DeiT-Ti ($72. 2\%$ accuracy when operating on RGB images).

Classification Image Classification +1

1,665

Paper
Code

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

no code implementations • 2 Sep 2023 • Minsik Cho, Keivan A. Vahid, Qichen Fu, Saurabh Adya, Carlo C Del Mundo, Mohammad Rastegari, Devang Naik, Peter Zatloukal

Since Large Language Models or LLMs have demonstrated high-quality performance on many complex language tasks, there is a great interest in bringing these LLMs to mobile devices for faster responses and better privacy protection.

Clustering Quantization

Paper
Add Code

On the Efficacy of Multi-scale Data Samplers for Vision Applications

no code implementations • 8 Sep 2023 • Elvis Nunez, Thomas Merth, Anish Prabhu, Mehrdad Farajtabar, Mohammad Rastegari, Sachin Mehta, Maxwell Horton

Multi-scale resolution training has seen an increased adoption across multiple vision tasks, including classification and detection.

Instance Segmentation Semantic Segmentation

Paper
Add Code

Do Compressed LLMs Forget Knowledge? An Experimental Study with Practical Implications

no code implementations • 2 Oct 2023 • Duc N. M Hoang, Minsik Cho, Thomas Merth, Mohammad Rastegari, Zhangyang Wang

We start by proposing two conjectures on the nature of the damage: one is certain knowledge being forgotten (or erased) after LLM compression, hence necessitating the compressed model to (re)learn from data with additional parameters; the other presumes that knowledge is internally displaced and hence one requires merely "inference re-direction" with input-side augmentation such as prompting, to recover the knowledge-related performance.

Paper
Add Code

Diffusion Models as Masked Audio-Video Learners

no code implementations • 5 Oct 2023 • Elvis Nunez, Yanzi Jin, Mohammad Rastegari, Sachin Mehta, Maxwell Horton

Over the past several years, the synchronization between audio and visual signals has been leveraged to learn richer audio-visual representations.

Audio Classification Contrastive Learning

Paper
Add Code

ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

1 code implementation • 6 Oct 2023 • Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, Mehrdad Farajtabar

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications.

6,912

Paper
Code

CLIP meets Model Zoo Experts: Pseudo-Supervision for Visual Enhancement

no code implementations • 21 Oct 2023 • Mohammadreza Salehi, Mehrdad Farajtabar, Maxwell Horton, Fartash Faghri, Hadi Pouransari, Raviteja Vemulapalli, Oncel Tuzel, Ali Farhadi, Mohammad Rastegari, Sachin Mehta

While CLIP is scalable, promptable, and robust to distribution shifts on image classification tasks, it lacks object localization capabilities.

Depth Estimation Image Classification +3

Paper
Add Code

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

no code implementations • 23 Oct 2023 • Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, Sachin Mehta, Mohammad Rastegari, Oncel Tuzel, Hadi Pouransari

By applying our method to SAM and CLIP, we obtain SAM-CLIP: a unified model that combines the capabilities of SAM and CLIP into a single vision transformer.

Continual Learning Multi-Task Learning +2

Paper
Add Code

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models

no code implementations • 30 Nov 2023 • Raviteja Vemulapalli, Hadi Pouransari, Fartash Faghri, Sachin Mehta, Mehrdad Farajtabar, Mohammad Rastegari, Oncel Tuzel

Motivated by this, we ask the following important question, "How can we leverage the knowledge from a large VFM to train a small task-specific model for a new target task with limited labeled training data?

Image Retrieval Retrieval +1

Paper
Add Code

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

no code implementations • 12 Dec 2023 • Keivan Alizadeh, Iman Mirzadeh, Dmitry Belenko, Karen Khatamifard, Minsik Cho, Carlo C Del Mundo, Mohammad Rastegari, Mehrdad Farajtabar

These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively.

Ranked #63 on Sentence Completion on HellaSwag

Language Modelling Large Language Model +1

Paper
Add Code

Weight subcloning: direct initialization of transformers using larger pretrained ones

no code implementations • 14 Dec 2023 • Mohammad Samragh, Mehrdad Farajtabar, Sachin Mehta, Raviteja Vemulapalli, Fartash Faghri, Devang Naik, Oncel Tuzel, Mohammad Rastegari

The usual practice of transfer learning overcomes this challenge by initializing the model with weights of a pretrained model of the same size and specification to increase the convergence and training speed.

Image Classification Transfer Learning

Paper
Add Code

Speculative Streaming: Fast LLM Inference without Auxiliary Models

no code implementations • 16 Feb 2024 • Nikhil Bhendawade, Irina Belousova, Qichen Fu, Henry Mason, Mohammad Rastegari, Mahyar Najibi

Speculative decoding is a prominent technique to speed up the inference of a large target language model based on predictions of an auxiliary draft model.

Language Modelling

Paper
Add Code

Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation

no code implementations • 10 Apr 2024 • Thomas Merth, Qichen Fu, Mohammad Rastegari, Mahyar Najibi

Despite the successes of large language models (LLMs), they exhibit significant drawbacks, particularly when processing long contexts.

Question Answering Retrieval

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.