1 code implementation • 13 Mar 2024 • Zhuang Liu, Kaiming He
We revisit the "dataset classification" experiment suggested by Torralba and Efros a decade ago, in the new era with large-scale, diverse, and hopefully less biased datasets as well as more capable neural network architectures.
1 code implementation • 25 Jan 2024 • Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He
In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation.
1 code implementation • 6 Dec 2023 • Tianhong Li, Dina Katabi, Kaiming He
This gap can be attributed to the lack of semantic information provided by labels.
Ranked #1 on Unconditional Image Generation on ImageNet 256x256
4 code implementations • CVPR 2023 • Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
We present Fast Language-Image Pre-training (FLIP), a simple and more efficient method for training CLIP.
3 code implementations • 18 May 2022 • Christoph Feichtenhofer, Haoqi Fan, Yanghao Li, Kaiming He
We randomly mask out spacetime patches in videos and learn an autoencoder to reconstruct them in pixels.
6 code implementations • 30 Mar 2022 • Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He
This design enables the original ViT architecture to be fine-tuned for object detection without needing to redesign a hierarchical backbone for pre-training.
Ranked #5 on Instance Segmentation on LVIS v1.0 val
2 code implementations • 22 Nov 2021 • Yanghao Li, Saining Xie, Xinlei Chen, Piotr Dollár, Kaiming He, Ross Girshick
The complexity of object detection methods can make this benchmarking non-trivial when new architectures, such as Vision Transformer (ViT) models, arrive.
49 code implementations • CVPR 2022 • Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
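The masking step in the MAE entry above can be illustrated with a minimal sketch. It only shows how patch indices might be split into visible and masked sets; the encoder, decoder, and pixel reconstruction loss are not shown. The function name `random_patch_mask`, the 75% mask ratio, and the 14x14 patch grid are assumptions chosen for illustration, not details taken from this listing.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into (visible, masked) by uniform random sampling.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Assumed setup: a 224x224 image cut into 16x16 patches gives 14 * 14 = 196 patches.
visible, masked = random_patch_mask(num_patches=196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # prints "49 147"
```

In a full pipeline, only the visible patches would be fed to the encoder, and the masked indices would mark which pixels the model is asked to reconstruct.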
2 code implementations • CVPR 2021 • Christoph Feichtenhofer, Haoqi Fan, Bo Xiong, Ross Girshick, Kaiming He
We present a large-scale study on unsupervised spatiotemporal representation learning from videos.
Ranked #3 on Self-Supervised Action Recognition on HMDB51
Representation Learning, Self-Supervised Action Recognition
8 code implementations • ICCV 2021 • Xinlei Chen, Saining Xie, Kaiming He
In this work, we go back to basics and investigate the effects of several fundamental components for training self-supervised ViT.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
Out-of-Distribution Generalization, Self-Supervised Image Classification
26 code implementations • CVPR 2021 • Xinlei Chen, Kaiming He
Our experiments show that collapsing solutions do exist for the loss and structure, but a stop-gradient operation plays an essential role in preventing collapsing.
Ranked #94 on Self-Supervised Image Classification on ImageNet
Representation Learning, Self-Supervised Image Classification
3 code implementations • ICML 2020 • Jiaxuan You, Jure Leskovec, Kaiming He, Saining Xie
Neural networks are often represented as graphs of connections between neurons.
23 code implementations • CVPR 2020 • Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár
In this work, we present a new network design paradigm.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
2 code implementations • ECCV 2020 • Chenxi Liu, Piotr Dollár, Kaiming He, Ross Girshick, Alan Yuille, Saining Xie
Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels.
36 code implementations • 9 Mar 2020 • Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He
Contrastive unsupervised learning has recently shown encouraging progress, e.g., in Momentum Contrast (MoCo) and SimCLR.
Ranked #3 on Contrastive Learning on imagenet-1k
14 code implementations • CVPR 2020 • Alexander Kirillov, Yuxin Wu, Kaiming He, Ross Girshick
We present a new method for efficient high-quality image segmentation of objects and scenes.
Ranked #3 on Instance Segmentation on COCO 2017 val
3 code implementations • CVPR 2020 • Chao-yuan Wu, Ross Girshick, Kaiming He, Christoph Feichtenhofer, Philipp Krähenbühl
We empirically demonstrate a general and robust grid schedule that yields a significant out-of-the-box training speedup without a loss in accuracy for different models (I3D, non-local, SlowFast), datasets (Kinetics, Something-Something, Charades), and training settings (with and without pre-training, 128 GPUs or 1 GPU).
Ranked #1 on Video Classification on Kinetics
45 code implementations • CVPR 2020 • Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick
This enables building a large and consistent dictionary on-the-fly that facilitates contrastive unsupervised learning.
Ranked #11 on Contrastive Learning on imagenet-1k
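The entry above does not say what makes the dictionary large and consistent. One common reading of the momentum contrast idea is a fixed-size first-in-first-out queue of encoded keys combined with a slowly moving key encoder. The sketch below assumes that reading; the names `momentum_update` and `KeyQueue`, the momentum value, and the use of plain lists of floats as "parameters" are all illustrative assumptions.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Move each key-encoder parameter slightly toward the query encoder.

    Assumed update rule: k <- m * k + (1 - m) * q.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


class KeyQueue:
    """Fixed-size FIFO dictionary of keys; the oldest keys are dropped first."""

    def __init__(self, max_size):
        self.keys = deque(maxlen=max_size)

    def enqueue(self, batch_keys):
        # deque(maxlen=...) discards items from the left once it is full.
        self.keys.extend(batch_keys)


queue = KeyQueue(max_size=4)
queue.enqueue(["k1", "k2", "k3"])
queue.enqueue(["k4", "k5"])
print(list(queue.keys))  # prints "['k2', 'k3', 'k4', 'k5']"
```

Because the queue size is decoupled from the minibatch size, the dictionary can hold many more keys than one batch provides, and the slow key-encoder update keeps older keys roughly comparable with newer ones.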
13 code implementations • ICCV 2019 • Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas
Current 3D object detection methods are heavily influenced by 2D detectors.
3D Object Detection, 3D Object Detection From Monocular Images
9 code implementations • ICCV 2019 • Saining Xie, Alexander Kirillov, Ross Girshick, Kaiming He
In this paper, we explore a more diverse set of connectivity patterns through the lens of randomly wired neural networks.
Ranked #118 on Neural Architecture Search on ImageNet
2 code implementations • ICCV 2019 • Xinlei Chen, Ross Girshick, Kaiming He, Piotr Dollár
To formalize this, we treat dense instance segmentation as a prediction task over 4D tensors and present a general framework called TensorMask that explicitly captures this geometry and enables novel operators on 4D tensors.
Ranked #90 on Instance Segmentation on COCO test-dev
12 code implementations • CVPR 2019 • Alexander Kirillov, Ross Girshick, Kaiming He, Piotr Dollár
In this work, we perform a detailed study of this minimally extended version of Mask R-CNN with FPN, which we refer to as Panoptic FPN, and show it is a robust and accurate baseline for both tasks.
Ranked #4 on Panoptic Segmentation on Indian Driving Dataset
4 code implementations • CVPR 2019 • Chao-yuan Wu, Christoph Feichtenhofer, Haoqi Fan, Kaiming He, Philipp Krähenbühl, Ross Girshick
To understand the world, we humans constantly need to relate the present to the past, and put events in context.
Ranked #4 on Egocentric Activity Recognition on EPIC-KITCHENS-55
15 code implementations • ICCV 2019 • Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He
We present SlowFast networks for video recognition.
Ranked #4 on Action Recognition on AVA v2.1
2 code implementations • CVPR 2019 • Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, Kaiming He
This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks.
Ranked #1 on Adversarial Defense on CAAD 2018
no code implementations • NeurIPS 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann LeCun
We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
1 code implementation • ICCV 2019 • Kaiming He, Ross Girshick, Piotr Dollár
We report competitive results on object detection and instance segmentation on the COCO dataset using standard models trained from random initialization.
Ranked #81 on Object Detection on COCO minival
1 code implementation • 14 Jun 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun
We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
4 code implementations • ECCV 2018 • Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, Laurens van der Maaten
ImageNet classification is the de facto pretraining task for these models.
Ranked #221 on Image Classification on ImageNet (using extra training data)
18 code implementations • ECCV 2018 • Yuxin Wu, Kaiming He
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Ranked #140 on Object Detection on COCO minival
9 code implementations • CVPR 2019 • Alexander Kirillov, Kaiming He, Ross Girshick, Carsten Rother, Piotr Dollár
We propose and study a task we name panoptic segmentation (PS).
Ranked #23 on Panoptic Segmentation on Cityscapes val (using extra training data)
4 code implementations • CVPR 2018 • Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He
We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.
3 code implementations • CVPR 2018 • Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick
Most methods for object instance segmentation require all training examples to be labeled with segmentation masks.
32 code implementations • CVPR 2018 • Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He
Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.
Ranked #8 on Action Classification on Toyota Smarthome dataset (using extra training data)
no code implementations • ICCV 2017 • Xiaolong Wang, Kaiming He, Abhinav Gupta
The objects are connected by two types of edges which correspond to two types of invariance: "different instances but a similar viewpoint and category" and "different viewpoints of the same instance".
231 code implementations • ICCV 2017 • Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Ranked #3 on Region Proposal on COCO test-dev
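To make the down-weighting of easy negatives in the Focal Loss entry above concrete, here is a minimal sketch of the commonly cited binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The formula is not stated in this listing, and the default values gamma=2.0 and alpha=0.25 are assumptions chosen for illustration.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction; p is P(class = 1), y is 0 or 1."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative defaults)."""
    p_t = p if y == 1 else 1.0 - p              # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # The modulating factor (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_negative = (0.01, 0)  # background scored as very unlikely to be an object
hard_negative = (0.90, 0)  # background wrongly scored as a likely object
for p, y in (easy_negative, hard_negative):
    print(p, cross_entropy(p, y), focal_loss(p, y))
```

With these settings the easy negative contributes several orders of magnitude less loss than it does under cross-entropy, while the hard negative keeps most of its weight.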
70 code implementations • 8 Jun 2017 • Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He
To achieve this result, we adopt a hyper-parameter-free linear scaling rule for adjusting learning rates as a function of minibatch size and develop a new warmup scheme that overcomes optimization challenges early in training.
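The two ingredients named in the entry above, a linear scaling rule and a warmup scheme, can be sketched as follows. The reference batch size of 256, the base learning rate of 0.1, the linear ramp shape, and the function names are assumptions for illustration; the listing does not specify them.

```python
def scaled_lr(base_lr, base_batch, batch_size):
    """Linear scaling rule: a minibatch k times larger gets a learning rate k times larger."""
    return base_lr * batch_size / base_batch


def warmup_lr(step, warmup_steps, start_lr, target_lr):
    """Ramp linearly from start_lr to target_lr over warmup_steps, then hold target_lr."""
    if step >= warmup_steps:
        return target_lr
    return start_lr + (target_lr - start_lr) * step / warmup_steps


# Assumed setup: reference rate 0.1 at batch size 256, scaled up to batch size 8192.
target = scaled_lr(base_lr=0.1, base_batch=256, batch_size=8192)
for step in (0, 250, 500, 1000):
    print(step, warmup_lr(step, warmup_steps=500, start_lr=0.1, target_lr=target))
```

Starting from the small reference rate and ramping up avoids applying the full scaled rate while the weights are still changing rapidly early in training.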
2 code implementations • CVPR 2018 • Georgia Gkioxari, Ross Girshick, Piotr Dollár, Kaiming He
Our hypothesis is that the appearance of a person -- their pose, clothing, action -- is a powerful cue for localizing the objects they are interacting with.
Ranked #53 on Human-Object Interaction Detection on HICO-DET
172 code implementations • ICCV 2017 • Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick
Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.
Ranked #1 on Keypoint Estimation on GRIT
85 code implementations • CVPR 2017 • Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, Serge Belongie
Feature pyramids are a basic component in recognition systems for detecting objects at different scales.
Ranked #3 on Pedestrian Detection on TJU-Ped-campus
58 code implementations • CVPR 2017 • Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set.
Ranked #3 on Image Classification on GasHisSDB
no code implementations • 24 Jul 2016 • Liliang Zhang, Liang Lin, Xiaodan Liang, Kaiming He
Pedestrian detection has arguably been addressed as a special topic beyond general object detection.
Ranked #19 on Pedestrian Detection on Caltech
49 code implementations • NeurIPS 2016 • Jifeng Dai, Yi Li, Kaiming He, Jian Sun
In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image.
Ranked #4 on Real-Time Object Detection on PASCAL VOC 2007
no code implementations • CVPR 2016 • Di Lin, Jifeng Dai, Jiaya Jia, Kaiming He, Jian Sun
Large-scale data is of crucial importance for learning semantic segmentation models, but annotating per-pixel masks is a tedious and inefficient procedure.
no code implementations • 29 Mar 2016 • Jifeng Dai, Kaiming He, Yi Li, Shaoqing Ren, Jian Sun
In contrast to the previous FCN that generates one score map, our FCN is designed to compute a small set of instance-sensitive score maps, each of which is the outcome of a pixel-wise classifier of a relative position to instances.
55 code implementations • 16 Mar 2016 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.
Ranked #17 on Image Classification on Kuzushiji-MNIST
2 code implementations • CVPR 2016 • Jifeng Dai, Kaiming He, Jian Sun
We develop an algorithm for the nontrivial end-to-end training of this causal, cascaded structure.
Ranked #3 on Multi-Human Parsing on PASCAL-Part
469 code implementations • CVPR 2016 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
Ranked #1 on Image Classification on cifar100
195 code implementations • NeurIPS 2015 • Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.
Ranked #2 on Vessel Detection on Vessel detection Dateset
no code implementations • CVPR 2015 • Dongping Li, Kaiming He, Jian Sun, Kun Zhou
The image projections will turn the straight lines into curved "geodesic lines", and it is fundamentally impossible to keep all these lines straight.
no code implementations • CVPR 2015 • Yan Xia, Kaiming He, Pushmeet Kohli, Jian Sun
This paper addresses the problem of learning long binary codes from high-dimensional data.
no code implementations • 26 May 2015 • Xiangyu Zhang, Jianhua Zou, Kaiming He, Jian Sun
This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community.
6 code implementations • 5 May 2015 • Kaiming He, Jian Sun
The guided filter is a technique for edge-aware image filtering.
no code implementations • 23 Apr 2015 • Shaoqing Ren, Kaiming He, Ross Girshick, Xiangyu Zhang, Jian Sun
We discover that aside from deep feature maps, a deep and convolutional per-region classifier is of particular importance for object detection, whereas the latest superior image classification models (such as ResNets and GoogLeNets) do not directly lead to good detection accuracy without using such a per-region classifier.
no code implementations • ICCV 2015 • Jifeng Dai, Kaiming He, Jian Sun
Recent leading approaches to semantic segmentation rely on deep convolutional networks trained with human-annotated, pixel-level segmentation masks.
Ranked #46 on Semantic Segmentation on PASCAL VOC 2012 test
16 code implementations • ICCV 2015 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
In this work, we study rectifier neural networks for image classification from two aspects.
60 code implementations • 31 Dec 2014 • Chao Dong, Chen Change Loy, Kaiming He, Xiaoou Tang
We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network.
Ranked #2 on Video Super-Resolution on Xiph HD - 4x upscaling
no code implementations • CVPR 2015 • Kaiming He, Jian Sun
Though recent advanced convolutional neural networks (CNNs) have been improving the image recognition accuracy, the models are getting more complex and time-consuming.
1 code implementation • CVPR 2015 • Jifeng Dai, Kaiming He, Jian Sun
The current leading approaches for semantic segmentation exploit shape information by extracting CNN features from masked image regions.
Ranked #61 on Semantic Segmentation on PASCAL Context
no code implementations • CVPR 2015 • Xiangyu Zhang, Jianhua Zou, Xiang Ming, Kaiming He, Jian Sun
This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs).
14 code implementations • 18 Jun 2014 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale.
Ranked #26 on Object Detection on PASCAL VOC 2007
no code implementations • CVPR 2014 • Tiezheng Ge, Kaiming He, Jian Sun
In this paper, we study a special case of sparse coding in which the codebook is a Cartesian product of two subcodebooks.
no code implementations • CVPR 2013 • Kaiming He, Fang Wen, Jian Sun
We propose a novel Affinity-Preserving K-means algorithm which simultaneously performs k-means clustering and learns the binary indices of the quantized cells.
no code implementations • CVPR 2013 • Tiezheng Ge, Kaiming He, Qifa Ke, Jian Sun
Product quantization is an effective vector quantization approach to compactly encode high-dimensional vectors for fast approximate nearest neighbor (ANN) search.
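As a rough illustration of the product quantization described in the entry above, the sketch below splits a vector into equal-length subvectors and stores, for each one, the index of the nearest centroid in that subspace's codebook. It shows plain product quantization only, not any optimization of the space decomposition; the function names and the tiny hand-written codebooks are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pq_encode(vector, codebooks):
    """Encode a vector as one centroid index per subspace.

    codebooks[i] is the list of centroids for the i-th subvector;
    len(vector) is assumed to be divisible by len(codebooks).
    """
    sub_dim = len(vector) // len(codebooks)
    code = []
    for i, codebook in enumerate(codebooks):
        sub = vector[i * sub_dim:(i + 1) * sub_dim]
        nearest = min(range(len(codebook)),
                      key=lambda j: squared_distance(sub, codebook[j]))
        code.append(nearest)
    return code


def pq_decode(code, codebooks):
    """Reconstruct an approximate vector by concatenating the chosen centroids."""
    approx = []
    for index, codebook in zip(code, codebooks):
        approx.extend(codebook[index])
    return approx


# Two subspaces of dimension 2, each with a 2-entry codebook (toy values).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0]],
]
code = pq_encode([0.9, 1.2, 4.0, 4.5], codebooks)
print(code, pq_decode(code, codebooks))  # prints "[1, 1] [1.0, 1.0, 5.0, 5.0]"
```

With M subspaces and K centroids each, a code of M small integers can represent K^M distinct reconstructions, which is what makes the encoding compact.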
2 code implementations • IEEE Transactions on Pattern Analysis and Machine Intelligence 2010 • Kaiming He, Jian Sun, Xiaoou Tang
The dark channel prior is a kind of statistics of outdoor haze-free images.
Ranked #1 on Single Image Haze Removal on RESIDE
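The statistic behind the dark channel prior entry above is usually defined as the minimum intensity over the color channels within a local patch around each pixel. A minimal sketch of that computation follows; the nested-list image representation, the function name, and the patch radius are assumptions for illustration, and the haze removal steps built on top of the statistic are not shown.

```python
def dark_channel(image, patch_radius):
    """Compute the dark channel of an RGB image.

    image: H x W nested list of (r, g, b) tuples with values in [0, 1].
    Returns an H x W nested list holding, for each pixel, the minimum channel
    value found in the square patch centered on it (clipped at image borders).
    """
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum over the three color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    # Step 2: minimum filter over the local patch.
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - patch_radius), min(height, y + patch_radius + 1)
            x0, x1 = max(0, x - patch_radius), min(width, x + patch_radius + 1)
            dark[y][x] = min(min_rgb[yy][xx]
                             for yy in range(y0, y1)
                             for xx in range(x0, x1))
    return dark


# Toy 2x2 image: one pixel has a near-zero channel, so it dominates every patch.
image = [
    [(0.9, 0.8, 0.7), (0.6, 0.5, 0.9)],
    [(0.4, 0.05, 0.3), (0.8, 0.8, 0.8)],
]
print(dark_channel(image, patch_radius=1))
```

In haze-free outdoor regions the dark channel tends to be close to zero, so larger values can be read as a cue for haze.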