These results indicate that aspects of vision transformers other than attention, such as the patch embedding, may be more responsible for their strong performance than previously thought.
Ranked #323 on Image Classification on ImageNet
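The patch embedding mentioned above can be sketched as follows: split the image into non-overlapping patches, flatten each patch, and project it with one learned linear map. This is a minimal illustrative version with hypothetical names, not the paper's code.

```python
# Minimal patch-embedding sketch (illustrative; names are assumptions).

def patchify(image, patch):
    """image: H x W grid (list of lists of floats); returns flattened P*P patches."""
    h, w = len(image), len(image[0])
    patches = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            flat = [image[i + di][j + dj]
                    for di in range(patch) for dj in range(patch)]
            patches.append(flat)
    return patches

def linear_embed(patches, weight):
    """weight: D rows of length patch*patch; returns one D-dim token per patch."""
    return [[sum(w_k * x_k for w_k, x_k in zip(row, p)) for row in weight]
            for p in patches]

# usage: a 4x4 "image", 2x2 patches, projected to 3-dim tokens
img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = linear_embed(patchify(img, 2), [[0.1] * 4, [0.2] * 4, [0.0] * 4])
```

The resulting tokens are what the rest of the network operates on; in this view, much of the spatial mixing already happens at embedding time.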
In this paper, we question whether self-supervised learning provides Vision Transformers (ViT) with new properties that stand out compared to convolutional networks (convnets).
Ranked #1 on Copy Detection on Copydays strong subset
In this paper, we explore open-domain sketch-to-photo translation, which aims to synthesize a realistic photo from a freehand sketch given its class label, even if sketches of that class are missing from the training data.
Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.
Ranked #3 on Text Classification on TREC-6
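One ingredient of inductive transfer for NLP fine-tuning is giving lower (more general) layers smaller learning rates than layers near the output. A minimal sketch, assuming a per-layer geometric decay; the function name and decay factor are illustrative:

```python
# Discriminative fine-tuning sketch: per-layer learning rates that
# shrink geometrically from the top layer downward (factor is an assumption).

def discriminative_lrs(num_layers, base_lr, decay=2.6):
    """Return one learning rate per layer; the last (top) layer gets base_lr."""
    return [base_lr / (decay ** (num_layers - 1 - i)) for i in range(num_layers)]

lrs = discriminative_lrs(3, base_lr=0.01)
# the top layer trains at 0.01; each lower layer at 1/2.6 of the one above
```

The intuition is that early layers capture general features that need little adaptation, while later layers are more task-specific.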
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.
Ranked #246 on Image Classification on ImageNet
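A building block "composed of a series of fully-connected layers" can be sketched generically as below. This is a plain MLP stack for illustration only, not the paper's RepMLP (which additionally re-parameterizes convolutional branches into the FC weights).

```python
# Generic MLP-style block sketch (illustrative, not the paper's RepMLP).

def fc(x, weight, bias):
    """One fully-connected layer; weight has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def mlp_block(x, layers):
    """layers: list of (weight, bias) pairs; ReLU between layers, none after the last."""
    for idx, (w, b) in enumerate(layers):
        x = fc(x, w, b)
        if idx < len(layers) - 1:
            x = relu(x)
    return x

# usage: 4-dim input -> 3 hidden units -> 2 outputs
out = mlp_block([1.0, -2.0, 0.5, 3.0],
                [([[0.1] * 4, [0.2] * 4, [-0.1] * 4], [0.0, 0.0, 0.0]),
                 ([[1.0, 1.0, 1.0], [0.5, 0.5, 0.5]], [0.1, -0.1])])
```

Unlike convolution, each FC layer here connects every input position to every output, which is the global-capacity property such blocks trade on.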
Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors, or ii) inaccurate embedding through an encoder.
However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection.
Ranked #12 on Instance Segmentation on COCO test-dev (APS metric)
Tracking non-rigidly deforming scenes using range sensors has numerous applications across computer vision, AR/VR, and robotics.
This paper proposes a novel method of learning by predicting view assignments with support samples (PAWS).
Ranked #313 on Image Classification on ImageNet
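The view-assignment idea can be sketched as a soft nearest-neighbour classifier: a view's class distribution is a softmax over its similarities to labeled support samples. The names and temperature value below are assumptions for illustration, not the paper's exact formulation.

```python
# Soft nearest-neighbour view assignment sketch (illustrative).

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def view_assignment(view, support, labels, num_classes, tau=0.1):
    """Soft class distribution for `view` from similarities to support samples."""
    sims = [cosine(view, s) / tau for s in support]
    m = max(sims)
    weights = [math.exp(s - m) for s in sims]  # numerically stable softmax
    z = sum(weights)
    probs = [0.0] * num_classes
    for w, y in zip(weights, labels):
        probs[y] += w / z
    return probs

# usage: one view, two labeled support samples from classes 0 and 1
p = view_assignment([1.0, 0.0], [[1.0, 0.1], [0.0, 1.0]], [0, 1], 2)
```

Because the assignment is differentiable in the embeddings, it can supervise representation learning from a small labeled support set.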
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.
Ranked #1 on Fine-Grained Image Classification on Oxford-IIIT Pets (using extra training data)
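The core operation such a Transformer applies to a sequence of patch tokens is scaled dot-product attention. A minimal single-head sketch without learned projections, purely for illustration:

```python
# Scaled dot-product self-attention sketch (single head, no projections).

import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    z = sum(es)
    return [e / z for e in es]

def attention(q, k, v):
    """q, k, v: lists of equal-length vectors (tokens); returns mixed tokens."""
    d = len(k[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[t] for w, vj in zip(weights, v))
                    for t in range(len(v[0]))])
    return out

# usage: self-attention over three 2-dim patch tokens
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = attention(tokens, tokens, tokens)
```

Each output token is a convex combination of all input tokens, which is what gives the architecture its global receptive field from the first layer.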