Search Results for author: Ishan Misra

Found 40 papers, 26 papers with code

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

1 code implementation 16 Feb 2022 Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski

Discriminative self-supervised learning allows training models on any random group of internet images, and possibly recovering salient information that helps differentiate between the images.

Ranked #1 on Copy Detection on Copydays strong subset (using extra training data)

Action Classification, Action Recognition, +10

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments

no code implementations 16 Feb 2022 Randall Balestriero, Ishan Misra, Yann LeCun

We show that for a training loss to be stable under data augmentation (DA) sampling, the model's saliency map (gradient of the loss with respect to the model's input) must align with the smallest eigenvector of the sample variance under the considered augmentation, hinting at a possible explanation for why models tend to shift their focus from edges to textures.

Data Augmentation

Omnivore: A Single Model for Many Visual Modalities

1 code implementation 20 Jan 2022 Rohit Girdhar, Mannat Singh, Nikhila Ravi, Laurens van der Maaten, Armand Joulin, Ishan Misra

Prior work has studied different visual modalities in isolation and developed separate architectures for recognition of images, videos, and 3D data.

Ranked #1 on Scene Recognition on SUN-RGBD (using extra training data)

Action Classification, Action Recognition, +3

Detecting Twenty-thousand Classes using Image-level Supervision

1 code implementation 7 Jan 2022 Xingyi Zhou, Rohit Girdhar, Armand Joulin, Phillip Krähenbühl, Ishan Misra

For the first time, we train a detector with all the twenty-one-thousand classes of the ImageNet dataset and show that it generalizes to new datasets without fine-tuning.

Image Classification

Mask2Former for Video Instance Segmentation

1 code implementation 20 Dec 2021 Bowen Cheng, Anwesa Choudhuri, Ishan Misra, Alexander Kirillov, Rohit Girdhar, Alexander G. Schwing

We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.

Instance Segmentation, Panoptic Segmentation, +3

3D Spatial Recognition without Spatially Labeled 3D

no code implementations CVPR 2021 Zhongzheng Ren, Ishan Misra, Alexander G. Schwing, Rohit Girdhar

We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition, requiring only scene-level class tags as supervision.

3D Object Detection, Multiple Instance Learning, +2

Emerging Properties in Self-Supervised Vision Transformers

16 code implementations ICCV 2021 Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

Copy Detection, Self-Supervised Image Classification, +5

Robust Audio-Visual Instance Discrimination

no code implementations CVPR 2021 Pedro Morgado, Ishan Misra, Nuno Vasconcelos

Second, since self-supervised contrastive learning relies on random sampling of negative instances, instances that are semantically similar to the base instance can be used as faulty negatives.

Action Recognition, Contrastive Learning, +2

Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning

1 code implementation ICCV 2021 Mandela Patrick, Yuki M. Asano, Bernie Huang, Ishan Misra, Florian Metze, Joao Henriques, Andrea Vedaldi

First, for space, we show that spatial augmentations such as cropping do work well for videos too, but that previous implementations, due to the high processing and memory cost, could not do this at a scale sufficient for it to work well.

Representation Learning, Self-Supervised Learning

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

15 code implementations 4 Mar 2021 Jure Zbontar, Li Jing, Ishan Misra, Yann LeCun, Stéphane Deny

This causes the embedding vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors.

General Classification, Object Detection, +3

Self-Supervised Pretraining of 3D Features on any Point-Cloud

1 code implementation ICCV 2021 Zaiwei Zhang, Rohit Girdhar, Armand Joulin, Ishan Misra

Pretraining on large labeled datasets is a prerequisite to achieve good performance in many computer vision tasks like 2D object recognition, video classification, etc.

Object Detection, Object Recognition, +2

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

12 code implementations NeurIPS 2020 Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin

In addition, we also propose a new data augmentation strategy, multi-crop, that uses a mix of views with different resolutions in place of two full-resolution views, without increasing the memory or compute requirements much.

Contrastive Learning, Data Augmentation, +2

In Defense of Grid Features for Visual Question Answering

2 code implementations CVPR 2020 Huaizu Jiang, Ishan Misra, Marcus Rohrbach, Erik Learned-Miller, Xinlei Chen

Popularized as 'bottom-up' attention, bounding box (or region) based visual features have recently surpassed vanilla grid-based convolutional features as the de facto standard for vision and language tasks like visual question answering (VQA).

Image Captioning, Question Answering, +2

ClusterFit: Improving Generalization of Visual Representations

1 code implementation CVPR 2020 Xueting Yan, Ishan Misra, Abhinav Gupta, Deepti Ghadiyaram, Dhruv Mahajan

Pre-training convolutional neural networks with weakly-supervised and self-supervised strategies is becoming increasingly popular for several computer vision tasks.

Action Classification, Image Classification, +1

Self-Supervised Learning of Pretext-Invariant Representations

7 code implementations CVPR 2020 Ishan Misra, Laurens van der Maaten

The goal of self-supervised learning from images is to construct image representations that are semantically meaningful via pretext tasks that do not require semantic annotations for a large training set of images.

Object Detection, Representation Learning, +3

Does Object Recognition Work for Everyone?

no code implementations 6 Jun 2019 Terrance DeVries, Ishan Misra, Changhan Wang, Laurens van der Maaten

The paper analyzes the accuracy of publicly available object-recognition systems on a geographically diverse dataset.

Object Recognition

Evaluating Text-to-Image Matching using Binary Image Selection (BISON)

no code implementations 19 Jan 2019 Hexiang Hu, Ishan Misra, Laurens van der Maaten

Providing systems the ability to relate linguistic and visual content is one of the hallmarks of computer vision.

Image Captioning, Image Retrieval

Learning by Asking Questions

no code implementations CVPR 2018 Ishan Misra, Ross Girshick, Rob Fergus, Martial Hebert, Abhinav Gupta, Laurens van der Maaten

We also show that our model asks questions that generalize to state-of-the-art VQA models and to novel test time distributions.

Question Answering, Visual Question Answering, +1

Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

2 code implementations ICCV 2017 Debidatta Dwibedi, Ishan Misra, Martial Hebert

In this paper, we propose a simple approach to generate large annotated instance datasets with minimal effort.

Object Detection

From Red Wine to Red Tomato: Composition With Context

no code implementations CVPR 2017 Ishan Misra, Abhinav Gupta, Martial Hebert

In this paper, we present a simple method that respects contextuality in order to compose classifiers of known visual concepts.

Generating Natural Questions About an Image

1 code implementation ACL 2016 Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, Lucy Vanderwende

There has been an explosion of work in the vision & language community during the past few years from image captioning to video transcription, and answering questions about images.

Image Captioning, Question Generation

Seeing through the Human Reporting Bias: Visual Classifiers from Noisy Human-Centric Labels

no code implementations CVPR 2016 Ishan Misra, C. Lawrence Zitnick, Margaret Mitchell, Ross Girshick

When human annotators are given a choice about what to label in an image, they apply their own subjective judgments on what to ignore and what to mention.

Image Captioning, Image Classification

Watch and Learn: Semi-Supervised Learning of Object Detectors from Videos

no code implementations 21 May 2015 Ishan Misra, Abhinav Shrivastava, Martial Hebert

We present a semi-supervised approach that localizes multiple unknown object instances in long videos.

Frame, Object Detection
