The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model.
Ranked #1 on Domain Generalization on ImageNet-Sketch (using extra training data)
Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate.
We present an efficient method for joint optimization of topology, materials and lighting from multi-view image observations.
For the first time, we train a detector with all twenty-one thousand classes of the ImageNet dataset and show that it generalizes to new datasets without fine-tuning.
The second edition of Deep Learning Interviews is home to hundreds of fully solved problems, from a wide range of key topics in AI.
Transformers have attracted increasing interest in computer vision, but they still fall behind state-of-the-art convolutional networks.
Ranked #1 on Image Classification on ImageNet (using extra training data)
In this paper, we ask the following question: is it possible to combine the strengths of CNNs and ViTs to build a lightweight, low-latency network for mobile vision tasks?
Ranked #379 on Image Classification on ImageNet
In addition, we present a transfer learning method that extracts critical features from the EEG group dataset and then customizes the model to a single individual by training its late layers on only 12 minutes of individual-specific data.
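
The idea of keeping the early layers fixed and training only the late layers on a small per-individual dataset can be illustrated with a toy sketch. This is not the paper's model: the two-stage linear network, the names `extract_features` and `finetune_head`, the learning rate, and the data are all assumptions chosen for illustration.

```python
def extract_features(x, frozen_weights):
    """Early layers (hypothetical): a fixed linear map followed by ReLU.

    These weights stand in for features learned on the group dataset
    and are never modified during per-individual fine-tuning.
    """
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in frozen_weights]


def predict(x, frozen_weights, head):
    """Late layer (hypothetical): a linear head on top of the frozen features."""
    feats = extract_features(x, frozen_weights)
    return sum(h * f for h, f in zip(head, feats))


def finetune_head(data, frozen_weights, head, lr=0.05, epochs=200):
    """Update only the head with per-sample gradient steps on squared error."""
    head = list(head)  # copy so the caller's group-level head is untouched
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x, frozen_weights)
            err = sum(h * f for h, f in zip(head, feats)) - y
            # The gradient flows into the head only; frozen_weights stay fixed.
            head = [h - lr * err * f for h, f in zip(head, feats)]
    return head


# Made-up "individual" data: targets follow y = 2*x0 + 3*x1.
frozen = [[1.0, 0.0], [0.0, 1.0]]
group_head = [0.5, 0.5]
individual_data = [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0), ([1.0, 1.0], 5.0)]
personal_head = finetune_head(individual_data, frozen, group_head)
```

Because only `head` is updated, the number of trainable parameters stays small, which is one reason a short recording per individual can be enough in this kind of setup.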